Media Download Module

This module provides functionality for downloading media content with Selenium, given ad data URLs and a valid access token.

download_media Function

AdDownloader.media_download.download_media(media_url, media_type, ad_id, media_folder)

Download media content for an ad given its ID.

Parameters:
  • media_url (str) – The URL of the media content.

  • media_type (str) – The type of the media content to download, can be ‘image’ or ‘videos’.

  • ad_id (str) – The ID of the ad for which media content is downloaded.

  • media_folder (str) – The path to the folder where media content will be saved.

Example:

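>>> from selenium.webdriver.common.by import By
>>> # driver is a running webdriver; img_xpath and folder_path_img are defined elsewhere
>>> driver.get(data['ad_snapshot_url'][0])
>>> img_element = driver.find_element(By.XPATH, img_xpath)
>>> media_url = img_element.get_attribute('src')
>>> media_type = 'image'
>>> download_media(media_url, media_type, str(data['id'][0]), folder_path_img)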
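
The download step itself can be pictured with a minimal sketch that streams a URL to a file. It uses only the standard library and is not the module's implementation: fetch_media, the ad_<id> file naming, and the .png/.mp4 extensions are assumptions made for illustration.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def fetch_media(media_url: str, media_type: str, ad_id: str, media_folder: str) -> Path:
    """Stream the content at media_url into media_folder and return the file path."""
    # Assumption: images are stored as .png and any other media type as .mp4.
    suffix = ".png" if media_type == "image" else ".mp4"
    folder = Path(media_folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    target = folder / f"ad_{ad_id}{suffix}"  # hypothetical naming scheme
    # Copy the response body to disk in chunks instead of loading it into memory.
    with urlopen(media_url, timeout=30) as response, open(target, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return target
```

Streaming in chunks keeps memory use flat, which matters more for videos than for images.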

accept_cookies Function

AdDownloader.media_download.accept_cookies(driver)

Accept the cookies in a running Chrome webdriver. This only needs to be done once, when opening the webdriver.

Parameters:
  • driver (webdriver.Chrome) – A running Chrome webdriver.

Example:

>>> from selenium import webdriver
>>> driver = webdriver.Chrome()
>>> driver.get(data['ad_snapshot_url'][0]) # start from here to accept cookies
>>> accept_cookies(driver)

start_media_download Function

AdDownloader.media_download.start_media_download(project_name, nr_ads, data=[])

Start media content download for a given project and a desired number of ads. The downloaded media are saved in the output folder under the given project_name.

Parameters:
  • project_name (str) – The name of the current project.

  • nr_ads (int) – The desired number of ads for which media content should be downloaded.

  • data (pandas.DataFrame) – A dataframe containing an ad_snapshot_url column.

Example:

>>> start_media_download(project_name = "test1", nr_ads = 20, data = data)
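
Before starting a download, it can help to confirm that data has the required column and that nr_ads does not exceed the number of available ads. The helper below is a hypothetical sketch, not part of AdDownloader. It is duck-typed, so it accepts a pandas.DataFrame as well as a plain dict of columns.

```python
def effective_nr_ads(data, nr_ads: int) -> int:
    """Validate the inputs and return how many ads can actually be processed."""
    # The membership test checks column names for a DataFrame and keys for a dict.
    if "ad_snapshot_url" not in data:
        raise ValueError("data must contain an 'ad_snapshot_url' column")
    if nr_ads < 1:
        raise ValueError("nr_ads must be a positive integer")
    # Never ask for more ads than there are rows.
    return min(nr_ads, len(data["ad_snapshot_url"]))
```

The result could then be passed on, for example start_media_download(project_name="test1", nr_ads=effective_nr_ads(data, 20), data=data).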

extract_frames Function

AdDownloader.media_download.extract_frames(video, project_name, interval=None, num_frames=None)

Extract a number of frames from an ad video.

Parameters:
  • video (str) – The name of the video for which frames should be extracted.

  • project_name (str) – The name of the current project.

  • interval (int) – The interval between the extracted frames (in seconds), optional. Should be specified instead of num_frames.

  • num_frames (int) – The number of frames to extract, distributed evenly, optional. Should be specified instead of interval.

Example:

>>> extract_frames(video = "test_video.mp4", project_name = "test1", interval = 3)
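
The difference between interval and num_frames can be illustrated by computing the timestamps at which frames would be taken. This is a sketch under stated assumptions, not the module's implementation: frame_timestamps and its duration argument are hypothetical, sampling starts at 0 seconds, and the sketch requires exactly one of the two options, which may be stricter than extract_frames itself.

```python
def frame_timestamps(duration, interval=None, num_frames=None):
    """Return the timestamps (in seconds) at which frames would be sampled."""
    # Assumption: exactly one of the two options must be given.
    if (interval is None) == (num_frames is None):
        raise ValueError("specify exactly one of interval or num_frames")
    if interval is not None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        # One frame every `interval` seconds, starting at 0.
        timestamps = []
        t = 0
        while t < duration:
            timestamps.append(t)
            t += interval
        return timestamps
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Spread num_frames evenly over the whole video.
    step = duration / num_frames
    return [i * step for i in range(num_frames)]
```

For a 10-second video, interval=3 gives frames at 0, 3, 6 and 9 seconds, while num_frames=4 gives frames at 0, 2.5, 5 and 7.5 seconds.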