Utils methods

scrapper_helpers.utils.caching(key_func=<function default_key_func>)

A decorator that creates local dumps of the decorated function’s return values for given parameters. It can take a key_func argument that determines the name of the output file.
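
The idea can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: the names caching_sketch, simple_key_func, and CACHE_DIR are hypothetical, and storing the dumps with pickle in a temporary directory is an assumption made for the example.

```python
import functools
import hashlib
import os
import pickle
import tempfile

# Hypothetical cache directory; the real dump location is not documented here.
CACHE_DIR = tempfile.mkdtemp(prefix="cache_sketch_")


def simple_key_func(*args):
    """Hypothetical key function: hash the repr of the arguments."""
    return hashlib.sha1(repr(args).encode("utf-8")).hexdigest()


def caching_sketch(key_func=simple_key_func):
    """Decorator factory: dump return values to files named by key_func."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Prefix with the function name so different functions do not collide.
            path = os.path.join(CACHE_DIR, func.__name__ + "_" + key_func(*args))
            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)  # cache hit: reuse the local dump
            result = func(*args)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)  # cache miss: store the result
            return result
        return wrapper
    return decorator


calls = []  # records how often the undecorated function really runs


@caching_sketch()
def square(x):
    calls.append(x)
    return x * x


first = square(4)   # computed and dumped
second = square(4)  # loaded from the dump
```

In this sketch the second call does not execute the function body again, which is the point of caching slow operations such as page downloads.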

scrapper_helpers.utils.default_key_func(*args)

This method creates the default string representation of the input parameters, which is used as the cache filename.

scrapper_helpers.utils.finder(many=True, *finder_args, **finder_kwargs)

Searches for an element (or elements, depending on the ‘many’ argument) with a certain key (tag, class, id, ...) in a web page’s markup.
Parameters:
  • many – whether to search for a single element or for several
  • finder_args – key to find
Returns:element or list of elements

scrapper_helpers.utils.flatten(container)

Flattens a list.
Parameters:
  • container (list) – list with nested lists
Return type:list
Returns:list with the elements that were nested in container
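
A minimal sketch of list flattening, for illustration only. The name flatten_sketch is hypothetical, and flattening to arbitrary depth is an assumption; the description above does not say how deep the real function goes.

```python
def flatten_sketch(container):
    """Return a flat list with the elements of all nested lists (any depth)."""
    flat = []
    for item in container:
        if isinstance(item, list):
            flat.extend(flatten_sketch(item))  # recurse into nested lists
        else:
            flat.append(item)
    return flat


nested = [1, [2, [3, 4]], [5]]
flat = flatten_sketch(nested)
```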

scrapper_helpers.utils.get_number_from_string(s, number_type, default)

Returns a numeric value of number_type created from the given string or default if the cast is not possible.
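
The cast-or-default behaviour can be sketched as follows. The name number_from_string_sketch is hypothetical, and the set of exceptions treated as a failed cast is an assumption.

```python
def number_from_string_sketch(s, number_type, default):
    """Cast s with number_type (e.g. int or float); fall back to default."""
    try:
        return number_type(s)
    except (TypeError, ValueError):
        # The cast is not possible (e.g. non-numeric text or None).
        return default


price = number_from_string_sketch("3.5", float, 0.0)
rooms = number_from_string_sketch("three", int, -1)
```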

scrapper_helpers.utils.get_random_user_agent()

Returns a random user agent, so that requests are not sent with the default “python” user agent.
Return type:str
Returns:random user agent from USER_AGENTS
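
A sketch of the same idea, assuming the user agents are kept in a plain list. USER_AGENTS_SKETCH and its entries are illustrative placeholders, not the library's actual USER_AGENTS.

```python
import random

# Placeholder values for illustration; not the library's USER_AGENTS list.
USER_AGENTS_SKETCH = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/2.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/3.0",
]


def random_user_agent_sketch():
    """Pick one user agent string at random."""
    return random.choice(USER_AGENTS_SKETCH)


# Typical use: put the value into the request headers.
headers = {"User-Agent": random_user_agent_sketch()}
```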

scrapper_helpers.utils.html_decode(s)

Returns the ASCII decoded version of the given HTML string. This does NOT remove normal HTML tags like <p>.
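
To illustrate what decoding HTML entities while keeping tags looks like, the sketch below uses html.unescape from the Python standard library as a stand-in. It is an assumption that this matches html_decode; the two may handle different sets of entities.

```python
import html

raw = "<p>Tom &amp; Jerry &quot;live&quot;</p>"

# Entities are decoded, but the <p> tags are left in place.
decoded = html.unescape(raw)
```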

scrapper_helpers.utils.key_md5(*args)

This method creates an MD5 hash of the input parameters, which is used as the cache filename.

scrapper_helpers.utils.key_sha1(*args)

This method creates a SHA-1 hash of the input parameters, which is used as the cache filename.
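
A sketch of hash-based key functions built on hashlib. The names key_md5_sketch and key_sha1_sketch are hypothetical, and the way the arguments are joined into one string before hashing is an assumption.

```python
import hashlib


def _join_args(*args):
    """Assumed serialisation: string form of each argument, joined by '_'."""
    return "_".join(str(a) for a in args).encode("utf-8")


def key_md5_sketch(*args):
    """Return a 32-character hex MD5 digest usable as a filename."""
    return hashlib.md5(_join_args(*args)).hexdigest()


def key_sha1_sketch(*args):
    """Return a 40-character hex SHA-1 digest usable as a filename."""
    return hashlib.sha1(_join_args(*args)).hexdigest()


md5_name = key_md5_sketch("https://example.com", 2)
sha1_name = key_sha1_sketch("https://example.com", 2)
```

Hashed keys have a fixed length and contain only hexadecimal characters, so they stay valid as filenames even when the arguments include URLs or other long strings.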

scrapper_helpers.utils.normalize_text(text, lower=True, replace_spaces='_')

This method returns the input string, normalized for use in a URL.
Parameters:
  • text – input string
Return type:string
Returns:normalized string: lowercase, without diacritics, and with spaces replaced by replace_spaces (‘_’ by default, per the signature)
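
A sketch of this kind of normalization using unicodedata. The name normalize_text_sketch is hypothetical, and the exact steps (NFKD decomposition, dropping non-ASCII characters, collapsing whitespace) are assumptions rather than the library's confirmed behaviour.

```python
import re
import unicodedata


def normalize_text_sketch(text, lower=True, replace_spaces="_"):
    """Lowercase, strip diacritics, and replace runs of whitespace."""
    if lower:
        text = text.lower()
    # NFKD splits accented letters into a base letter plus combining marks;
    # encoding to ASCII with "ignore" then drops the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii").strip()
    if replace_spaces:
        ascii_only = re.sub(r"\s+", replace_spaces, ascii_only)
    return ascii_only


slug = normalize_text_sketch("Crème Brûlée")
dashed = normalize_text_sketch("Crème Brûlée", replace_spaces="-")
```

Note that in this sketch characters without an ASCII decomposition are dropped entirely rather than transliterated.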

scrapper_helpers.utils.replace_all(text, dic)

This method returns the input string with its characters replaced according to the input dictionary.
Parameters:
  • text – input string
  • dic – dictionary containing the changes; each key is the character to be replaced and each value is the desired replacement
Return type:string
Returns:string with the corresponding characters replaced
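
A sketch of dictionary-driven replacement. The name replace_all_sketch is hypothetical, and applying the replacements one after another with str.replace is an assumption about how such a helper might work.

```python
def replace_all_sketch(text, dic):
    """Replace every key of dic found in text with its value."""
    for old, new in dic.items():
        # Replacements run sequentially, so earlier ones can affect later ones.
        text = text.replace(old, new)
    return text


cleaned = replace_all_sketch("ąćę", {"ą": "a", "ć": "c", "ę": "e"})
```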