text_sensitivity.data.random
Generate random data for robustness and sensitivity testing.
Submodules:
text_sensitivity.data.random.entity module
Generation of random entities (e.g. names, telephone numbers) for given languages.
- class text_sensitivity.data.random.entity.CityByPopulationMixin
Bases:
Readable
- add_likelihood_to_cities()
Add likelihood to cities, based on population.
- static cities_by_population(cities, country_code)
Add population scores to each city in a country.
- Parameters
cities (List[str]) – Current list of cities. If no replacement is found, this list is returned unchanged.
country_code (str) – Two-letter country code (e.g. ‘nl’).
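To make the idea of population-based likelihood concrete, the following is a minimal standalone sketch using only the standard library. It is not the library's implementation: the `POPULATION` table, its figures, and the helper names `cities_with_likelihood` and `sample_cities` are hypothetical, and the uniform fallback for unknown countries is an assumption.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical population figures, for illustration only (not the library's data).
POPULATION: Dict[str, Dict[str, int]] = {
    "nl": {"Amsterdam": 900_000, "Rotterdam": 650_000, "Utrecht": 360_000},
}


def cities_with_likelihood(cities: List[str], country_code: str) -> List[Tuple[str, float]]:
    """Pair each city with a likelihood proportional to its population."""
    known = POPULATION.get(country_code.lower())
    if not known:
        # Assumption: without population data, keep the input cities with uniform likelihood.
        return [(city, 1.0 / len(cities)) for city in cities]
    total = sum(known.values())
    return [(city, population / total) for city, population in known.items()]


def sample_cities(weighted: List[Tuple[str, float]], n: int, seed: int = 0) -> List[str]:
    """Draw n cities, with larger cities drawn more often."""
    rng = random.Random(seed)  # seeded for reproducibility
    names = [name for name, _ in weighted]
    weights = [weight for _, weight in weighted]
    return rng.choices(names, weights=weights, k=n)
```

Calling `sample_cities(cities_with_likelihood([], "nl"), n=5)` twice with the same seed yields the same five cities.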
- class text_sensitivity.data.random.entity.RandomAddress(languages=<current locale>, likelihood_based_on_city_population=True, sep='\n', seed=0)
Bases:
RandomEntity, CityByPopulationMixin
Generate random addresses in (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
likelihood_based_on_city_population (bool) –
sep (str) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomCity(languages=<current locale>, likelihood_based_on_city_population=True, seed=0)
Bases:
RandomEntity, CityByPopulationMixin
Generate random cities in (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
likelihood_based_on_city_population (bool) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomCountry(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random countries for (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomCryptoCurrency(seed=0)
Bases:
RandomEntity
Generate random cryptocurrency names.
- Parameters
seed (int) –
- class text_sensitivity.data.random.entity.RandomCurrencySymbol(seed=0)
Bases:
RandomEntity
Generate random currency symbols.
- Parameters
seed (int) –
- class text_sensitivity.data.random.entity.RandomDay(seed=0)
Bases:
RandomEntity
Generate random day of the month.
- Parameters
seed (int) –
- class text_sensitivity.data.random.entity.RandomDayOfWeek(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random day of week in (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomEmail(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random e-mail addresses for (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomEntity(languages=<current locale>, providers=['person'], fn_name='name', attribute='fn', attribute_rename=None, sep='\n', seed=0)
Bases:
Readable, SeedMixin, CaseMixin
Base class to generate entity data for (a) given language(s).
Example
Generate 10 random English names using the faker package:
>>> RandomEntity(languages='en', providers=['person'], fn_name='name').generate_list(n=10)
- Parameters
languages (Union[str, List[str]], optional) – Languages to generate data from. Defaults to your current locale (see get_locale()).
providers (List[str], optional) – Providers from faker used in generation. Defaults to [‘person’].
fn_name (Union[str, List[str]], optional) – Function name(s) to call for each generator. Defaults to ‘name’.
attribute (str, optional) – Name of additional attribute (other than language). Defaults to ‘fn’.
attribute_rename (Optional[Callable[[str], str]], optional) – Rename function for attribute value. Defaults to None.
sep (str, optional) – Separator to replace the ‘\n’ character with. Defaults to ‘\n’.
seed (int, optional) – Seed for reproducibility. Defaults to 0.
- generate(n, attributes=False)
Generate n instances of random data.
- Parameters
n (int) – Number of instances to generate.
attributes (bool, optional) – Include attributes (language, which function was used, etc.) or not. Defaults to False.
- Returns
Provider containing generated instances (if attributes = False), or a Tuple[TextInstanceProvider, Dict[str, MemoryLabelProvider]] of the provider and corresponding attribute labels (if attributes = True).
- Return type
TextInstanceProvider
- generate_list(n, attributes=False)
Generate n instances of random data and return as list.
- Parameters
n (int) – Number of instances to generate.
attributes (bool, optional) – Include attributes (language, which function was used, etc.) or not. Defaults to False.
- Returns
Generated instances (if attributes = False), or a Tuple[List[str], Dict[str, str]] of the generated instances and corresponding attributes (if attributes = True).
- Return type
List[str]
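The effect of the attributes flag can be illustrated with a small standalone sketch. This is a stand-in using only the standard library, not the class itself: the `NAMES` pools and the free function `generate_list` below are hypothetical, the real class draws from faker providers, and the exact shape of the attribute dictionary here is an assumption.

```python
import random
from typing import Dict, List, Tuple, Union

# Hypothetical name pools per language; the real class uses faker providers instead.
NAMES: Dict[str, List[str]] = {
    "en": ["Alice Smith", "John Brown", "Mary Jones"],
    "nl": ["Sanne de Vries", "Daan Jansen", "Emma Bakker"],
}


def generate_list(
    n: int, languages: List[str], attributes: bool = False, seed: int = 0
) -> Union[List[str], Tuple[List[str], Dict[str, List[str]]]]:
    """Generate n names; optionally also return the language used for each one."""
    rng = random.Random(seed)  # same seed, same output
    instances: List[str] = []
    used_languages: List[str] = []
    for _ in range(n):
        language = rng.choice(languages)
        instances.append(rng.choice(NAMES[language]))
        used_languages.append(language)
    if attributes:
        # Assumed shape: one list of attribute values per attribute name.
        return instances, {"language": used_languages}
    return instances
```

With the same seed, the instances are identical whether or not attributes are requested; the flag only adds the per-instance labels.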
- class text_sensitivity.data.random.entity.RandomFirstName(languages=<current locale>, sex=['male', 'female'], seed=0)
Bases:
RandomEntity
Generate random first names for (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
sex (List[str]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomLastName(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random last names for (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomLicensePlate(seed=0)
Bases:
RandomEntity
Generate random license plates for a given country.
- Parameters
seed (int) –
- class text_sensitivity.data.random.entity.RandomMonth(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random month name in (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomName(languages=<current locale>, sex=['male', 'female'], seed=0)
Bases:
RandomEntity
Generate random full names for (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
sex (List[str]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomPhoneNumber(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random phone numbers for (a) given language(s) / country.
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomPriceTag(languages=<current locale>, seed=0)
Bases:
RandomEntity
Generate random price tags in the currency of (a) given language(s).
- Parameters
languages (Union[str, List[str]]) –
seed (int) –
- class text_sensitivity.data.random.entity.RandomYear(seed=0)
Bases:
RandomEntity
Generate random year.
- Parameters
seed (int) –
text_sensitivity.data.random.string module
Generate random strings from characters/strings.
- class text_sensitivity.data.random.string.RandomAscii(seed=0)
Bases:
RandomString
Generate random ASCII characters.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomCyrillic(languages='ru', upper=True, lower=True, seed=0)
Bases:
RandomString
Generate strings containing random Cyrillic characters.
Can generate text in Bulgarian (‘bg’), Macedonian (‘mk’), Russian (‘ru’), Serbian (‘sr’), Ukrainian (‘uk’), and all combinations thereof.
- Parameters
languages (Union[List[str], str], optional) – Cyrillic languages to select. Defaults to ‘ru’.
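upper (bool, optional) – Whether to include uppercase characters. Defaults to True.
lower (bool, optional) – Whether to include lowercase characters. Defaults to True.
seed (int, optional) – Seed for reproducibility. Defaults to 0.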
- Raises
ValueError – Either upper or lower should be True.
ValueError – One of the selected languages is unknown.
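The interplay of languages, upper and lower, and the two ValueError cases, can be sketched as follows. This is an illustrative stand-in and not the class's code: the `ALPHABETS` table covers only Russian and Ukrainian, and the helper name `cyrillic_options` is hypothetical.

```python
from typing import Dict, List, Union

# Lowercase alphabets for two of the listed languages (illustrative subset only).
ALPHABETS: Dict[str, str] = {
    "ru": "абвгдеёжзийклмнопрстуфхцчшщъыьэюя",
    "uk": "абвгґдеєжзиіїйклмнопрстуфхцчшщьюя",
}


def cyrillic_options(
    languages: Union[str, List[str]] = "ru", upper: bool = True, lower: bool = True
) -> str:
    """Build the character set to sample from for the selected languages."""
    if not (upper or lower):
        raise ValueError("Either upper or lower should be True.")
    if isinstance(languages, str):
        languages = [languages]
    unknown = [lang for lang in languages if lang not in ALPHABETS]
    if unknown:
        raise ValueError(f"Unknown language(s): {unknown}")
    # Merge the alphabets, dropping duplicates; sorting keeps the order deterministic.
    letters = sorted(set("".join(ALPHABETS[lang] for lang in languages)))
    chars: List[str] = []
    if upper:
        chars.extend(ch.upper() for ch in letters)
    if lower:
        chars.extend(letters)
    return "".join(chars)
```

For example, `cyrillic_options(["ru", "uk"], upper=False)` contains each lowercase letter of both alphabets exactly once.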
- class text_sensitivity.data.random.string.RandomDigits(seed=0)
Bases:
RandomString
Generate strings containing random digits.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomEmojis(seed=0, base=True, dingbats=True, flags=True, components=True)
Bases:
RandomString
Generate strings containing a subset of random unicode emojis.
- Parameters
seed (int, optional) – Seed for reproducibility. Defaults to 0.
base (bool, optional) – Include base emojis (e.g. smiley face). Defaults to True.
dingbats (bool, optional) – Include dingbat emojis. Defaults to True.
flags (bool, optional) – Include flag emojis. Defaults to True.
components (bool, optional) – Include emoji components (e.g. skin color modifier or country flags). Defaults to True.
- Raises
ValueError – At least one of base, dingbats, flags should be True.
- class text_sensitivity.data.random.string.RandomLower(seed=0)
Bases:
RandomString
Generate random ASCII lowercase characters.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomPunctuation(seed=0)
Bases:
RandomString
Generate strings containing random punctuation characters.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomSpaces(seed=0)
Bases:
RandomString
Generate strings with a random number of spaces.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomString(seed=0, options=string.printable)
Bases:
Readable, SeedMixin
Base class for random data (string) generation.
- Parameters
seed (int, optional) – Seed for reproducibility. Defaults to 0.
options (Union[str, List[str]], optional) – Characters or strings to generate data from. Defaults to string.printable.
- generate(n, min_length=0, max_length=100)
Generate n instances of random strings.
Example
Create a TextInstanceProvider containing n=10 strings of length 3 to 10, drawn from the characters ‘12345xXyY!?’:
>>> RandomString(seed=0, options='12345xXyY!?').generate(n=10, min_length=3, max_length=10)
- Parameters
n (int) – Number of instances to generate.
min_length (int, optional) – Minimum length of random instance. Defaults to 0.
max_length (int, optional) – Maximum length of random instance. Defaults to 100.
- Raises
ValueError – min_length should be smaller than max_length.
- Returns
Provider containing generated instances.
- Return type
TextInstanceProvider
- generate_list(n, min_length=0, max_length=100)
Generate n instances of random strings and return as list.
Example
Generate a list of n=10 random strings of length 10 to 50, drawn from the characters u’ABCabc\U0001F600’:
>>> RandomString(seed=0, options=u'ABCabc\U0001F600').generate_list(n=10, min_length=10, max_length=50)
- Parameters
n (int) – Number of instances to generate.
min_length (int, optional) – Minimum length of random instance. Defaults to 0.
max_length (int, optional) – Maximum length of random instance. Defaults to 100.
- Raises
ValueError – min_length should be smaller than max_length.
- Returns
List containing generated instances.
- Return type
List[str]
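The behaviour of n, min_length, max_length and seed can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the class's implementation: the function name `random_strings` is hypothetical, lengths are assumed to be drawn uniformly with both bounds included, and equal bounds are assumed to be allowed.

```python
import random
import string
from typing import List, Sequence


def random_strings(
    n: int,
    options: Sequence[str] = string.printable,
    min_length: int = 0,
    max_length: int = 100,
    seed: int = 0,
) -> List[str]:
    """Generate n strings built from `options`, each of a random length."""
    if min_length > max_length:
        raise ValueError("min_length should be smaller than max_length.")
    rng = random.Random(seed)  # a private, seeded generator gives reproducible output
    return [
        # Pick a length in [min_length, max_length], then that many random options.
        "".join(rng.choice(options) for _ in range(rng.randint(min_length, max_length)))
        for _ in range(n)
    ]
```

Because `options` may be a list of strings as well as a single string, the same sketch covers multi-character building blocks such as emoji sequences.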
- class text_sensitivity.data.random.string.RandomUpper(seed=0)
Bases:
RandomString
Generate random ASCII uppercase characters.
- Parameters
seed (int) –
- class text_sensitivity.data.random.string.RandomWhitespace(seed=0)
Bases:
RandomString
Generate strings with a random number of whitespace characters.
- Parameters
seed (int) –
- text_sensitivity.data.random.string.combine_generators(*generators, seed=None)
Combine multiple random string generators into one.
- Parameters
*generators – Generators to combine.
seed (Optional[int]) – Seed value for the new generator. If None, a seed is picked at random from those of the combined generators. Defaults to None.
Example
Make a generator that generates random punctuation, emojis and ASCII characters:
>>> new_generator = combine_generators(RandomPunctuation(), RandomEmojis(), RandomAscii())
- Returns
Generator with all generator options combined.
- Return type
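What combining generators amounts to can be sketched as merging their option sets and choosing a seed. The code below is a standalone illustration and not the library's code: `OptionGenerator` and `combine` are hypothetical names, and dropping duplicate options is an assumption.

```python
import random
import string
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OptionGenerator:
    """Hypothetical stand-in for a random string generator: options plus a seed."""
    options: List[str]
    seed: int = 0


def combine(*generators: OptionGenerator, seed: Optional[int] = None) -> OptionGenerator:
    """Merge the options of all generators into a single new generator."""
    if not generators:
        raise ValueError("At least one generator is required.")
    if seed is None:
        # No seed given: reuse the seed of one of the generators, picked at random.
        seed = random.choice([generator.seed for generator in generators])
    combined: List[str] = []
    for generator in generators:
        for option in generator.options:
            if option not in combined:  # assumption: each option is kept only once
                combined.append(option)
    return OptionGenerator(options=combined, seed=seed)
```

For instance, `combine(OptionGenerator(list(string.punctuation)), OptionGenerator(list(string.digits)), seed=42)` yields a generator whose options span punctuation and digits, with seed 42.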