H.2. data_generator#
H.2.1. Overview #
data_generator is a Python script that generates fake data.
usage: populate.py [-h] --table {address,city,country,first_name,last_name,company,email,iban,lorem_ipsum,postcode}
[{address,city,country,first_name,last_name,company,email,iban,lorem_ipsum,postcode} ...] [--locales LOCALES] --output_dir OUTPUT_DIR
[--lines LINES] [--seed SEED]
Internally, it uses the Faker library.
To produce 5000 lines of fake country and email data using the Russian and English locales, call the script like this:
populate.py --table country email --locales ru_RU,en --lines 5000 --output_dir out
This will output the fake data in CSV format.
Use populate.py --help for more details about the script parameters.
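To make the usage line above easier to read, here is a minimal sketch of how a command-line interface with the same shape could be declared with Python's argparse module. It is not the actual source of populate.py: the option names and table choices are copied from the usage text, but the integer types for --lines and --seed and the comma-splitting of --locales are assumptions.

```python
import argparse

# Table names copied from the usage text above.
TABLES = [
    "address", "city", "country", "first_name", "last_name",
    "company", "email", "iban", "lorem_ipsum", "postcode",
]


def build_parser():
    """Build a parser that mirrors the documented usage line (sketch only)."""
    parser = argparse.ArgumentParser(prog="populate.py")
    # One or more table names, each restricted to the documented choices.
    parser.add_argument("--table", nargs="+", choices=TABLES, required=True)
    # Comma-separated locale list, e.g. "ru_RU,en" (splitting is an assumption).
    parser.add_argument("--locales")
    parser.add_argument("--output_dir", required=True)
    # Integer types are assumptions; the usage text does not state them.
    parser.add_argument("--lines", type=int)
    parser.add_argument("--seed", type=int)
    return parser


# Parse the example command line shown earlier in this section.
args = build_parser().parse_args(
    ["--table", "country", "email",
     "--locales", "ru_RU,en",
     "--lines", "5000",
     "--output_dir", "out"]
)
locales = args.locales.split(",")
print(args.table, locales, args.lines, args.output_dir)
```

Declaring --table with nargs="+" and choices is what allows several table names after a single --table flag, as in the example command.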
You can load the fake data directly into the extension like this:
TRUNCATE transp_anon.email;
COPY transp_anon.email
FROM
PROGRAM 'populate.py --table country email --locales ru_RU,en --lines 5000 --output_dir out';
SELECT setval('transp_anon.email_oid_seq', max(oid))
FROM transp_anon.email;
CLUSTER transp_anon.email;
H.2.2. Faker #
Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you. For more information, see the Faker documentation.
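As a rough illustration of the kind of Faker calls a generator script like this might make, the sketch below writes fake emails as CSV rows. It is not taken from populate.py. The helper names write_rows and fake_emails_csv are hypothetical, and the two-column (oid, value) layout is an assumption suggested by the oid sequence used in the loading example. The Faker calls themselves (Faker.seed, a Faker instance built from a list of locales, and the email provider) are part of the public Faker API.

```python
import csv
import io


def write_rows(generate, lines, stream):
    """Write `lines` CSV rows of (oid, value) to `stream`.

    `generate` is any zero-argument callable that returns one fake value.
    The (oid, value) layout is an assumption, not the documented format.
    """
    writer = csv.writer(stream)
    for oid in range(1, lines + 1):
        writer.writerow([oid, generate()])


def fake_emails_csv(lines=5, locales=("ru_RU", "en_US"), seed=42):
    """Return CSV text of fake emails (needs the third-party Faker package)."""
    from faker import Faker  # imported lazily so write_rows has no dependency

    Faker.seed(seed)             # makes the generated values reproducible
    fake = Faker(list(locales))  # values are drawn from the listed locales
    buffer = io.StringIO()
    write_rows(fake.email, lines, buffer)
    return buffer.getvalue()
```

With Faker installed (pip install Faker), print(fake_emails_csv(lines=3)) prints three rows. Passing the generator as a callable keeps the CSV logic independent of Faker, so other providers such as fake.city or fake.iban can be swapped in.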