Quick start with faker, a Python data generation tool

2020-04-29  python测试开发

Introduction

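Faker generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, or anonymize data taken from a production service, Faker is for you.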

Installation

pip install faker

Basic usage

>>> from faker import Faker
>>> fake = Faker()
>>> fake.name()
'Lance Yates'
>>> fake.address()
'848 Edwin Stravenue Apt. 448\nLoriburgh, ND 43095'
>>> fake.text()
'Mean economy fight organization else throughout sport. Over Congress also rich should mind. Candidate factor investment national behind stock do.'
>>> fake = Faker('zh_CN')
>>> fake.name()
'周磊'
>>> fake.address()
'河北省霞县静安刘街D座 151159'
>>> fake.text()
'可以很多完成进入起来项目以及.主题同时现在项目来源.\n责任系统更多作者根据质量控制.业务留言正在一个起来目前记者.\n继续其中觉得其实语言.行业重要文件环境发布继续.最后各种其他发布.来自电脑教育社会.\n你的企业他们应该.功能东西城市到了准备.出现自己显示搜索汽车一起.\n中心经验公司进入所有地区.社区名称登录服务其中.男人决定在线只是活动.'

>>> for _ in range(10):
...   print(fake.name())
... 
徐辉
卢瑜
崔红梅
苏玉兰
苏兰英
杨雷
许淑兰
鲁佳
张帆
李倩

Providers

Each generator property (such as name, address, and lorem) is called a "fake". A Faker generator has many of them, packaged in "providers".

>>> from faker import Faker
>>> from faker.providers import internet
>>> fake = Faker()
>>> fake.add_provider(internet)
>>> print(fake.ipv4_private())
172.29.38.251

Localization

faker.Faker can take a locale as an argument to return localized data. If no localized provider is found, the factory falls back to the default en_US locale.

>>> from faker import Faker
>>> fake = Faker('it_IT')
>>> print(fake.name())
Gian Corbo 
>>> 
>>> from faker import Faker
>>> fake = Faker(['it_IT', 'en_US', 'ja_JP'])
>>> print(fake.name())
Elizabeth Green
>>> print(fake.name())
Coriolano Liverotti
>>> print(fake.name())
井上 稔

Supported locales include:

ar_EG - Arabic (Egypt)
ar_PS - Arabic (Palestine)
ar_SA - Arabic (Saudi Arabia)
bg_BG - Bulgarian
bs_BA - Bosnian
cs_CZ - Czech
de_DE - German
dk_DK - Danish
el_GR - Greek
en_AU - English (Australia)
en_CA - English (Canada)
en_GB - English (Great Britain)
en_IN - English (India)
en_NZ - English (New Zealand)
en_US - English (United States)
es_ES - Spanish (Spain)
es_MX - Spanish (Mexico)
et_EE - Estonian
fa_IR - Persian (Iran)
fi_FI - Finnish
fr_FR - French
hi_IN - Hindi
hr_HR - Croatian
hu_HU - Hungarian
hy_AM - Armenian
it_IT - Italian
ja_JP - Japanese
ka_GE - Georgian (Georgia)
ko_KR - Korean
lt_LT - Lithuanian
lv_LV - Latvian
ne_NP - Nepali
nl_NL - Dutch (Netherlands)
no_NO - Norwegian
pl_PL - Polish
pt_BR - Portuguese (Brazil)
pt_PT - Portuguese (Portugal)
ro_RO - Romanian
ru_RU - Russian
sl_SI - Slovene
sv_SE - Swedish
tr_TR - Turkish
uk_UA - Ukrainian
zh_CN - Chinese (China)
zh_TW - Chinese (Taiwan)

Command-line usage

$ faker -h
usage: faker [-h] [--version] [-v] [-o output] [-l LOCALE] [-r REPEAT]
             [-s SEP] [--seed SEED] [-i [INCLUDE [INCLUDE ...]]]
             [fake] [fake argument [fake argument ...]]

faker version 4.0.3

positional arguments:
  fake                  name of the fake to generate output for (e.g. profile)
  fake argument         optional arguments to pass to the fake (e.g. the
                        profile fake takes an optional list of comma separated
                        field names as the first argument)

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v, --verbose         show INFO logging events instead of CRITICAL, which is
                        the default. These logging events provide insight into
                        localization of specific providers.
  -o output             redirect output to a file
  -l LOCALE, --lang LOCALE
                        specify the language for a localized provider (e.g.
                        de_DE)
  -r REPEAT, --repeat REPEAT
                        generate the specified number of outputs
  -s SEP, --sep SEP     use the specified separator after each output
  --seed SEED           specify a seed for the random generator so that
                        results are repeatable. Also compatible with 'repeat'
                        option
  -i [INCLUDE [INCLUDE ...]], --include [INCLUDE [INCLUDE ...]]
                        list of additional custom providers to user, given as
                        the import path of the module containing your Provider
                        class (not the provider class itself)

supported locales:

  ar_AA, ar_EG, ar_JO, ar_PS, ar_SA, bg_BG, bs_BA, cs_CZ, de, de_AT, de_CH, de_DE, dk_DK, el_CY, el_GR, en, en_AU, en_CA, en_GB, en_IE, en_IN, en_NZ, en_PH, en_TH, en_US, es, es_CA, es_ES, es_MX, et_EE, fa_IR, fi_FI, fil_PH, fr_CH, fr_FR, fr_QC, he_IL, hi_IN, hr_HR, hu_HU, hy_AM, id_ID, it_IT, ja_JP, ka_GE, ko_KR, la, lb_LU, lt_LT, lv_LV, mt_MT, ne_NP, nl_BE, nl_NL, no_NO, pl_PL, pt_BR, pt_PT, ro_RO, ru_RU, sk_SK, sl_SI, sv_SE, ta_IN, th_TH, tl_PH, tr_TR, tw_GH, uk_UA, zh_CN, zh_TW

  Faker can take a locale as an optional argument, to return localized data. If
  no locale argument is specified, the factory falls back to the user's OS
  locale as long as it is supported by at least one of the providers.
     - for this user, the default locale is en_US.

  If the optional argument locale and/or user's default locale is not available
  for the specified provider, the factory falls back to faker's default locale,
  which is en_US.

examples:

  $ faker address
  968 Bahringer Garden Apt. 722
  Kristinaland, NJ 09890

  $ faker -l de_DE address
  Samira-Niemeier-Allee 56
  94812 Biedenkopf

  $ faker profile ssn,birthdate
  {'ssn': u'628-10-1085', 'birthdate': '2008-03-29'}

  $ faker -r=3 -s=";" name
  Willam Kertzmann;
  Josiah Maggio;
  Gayla Schmitt;


$ faker -l de_DE address
Mina-Adolph-Straße 1/5
90627 Hildburghausen

$ faker profile ssn,birthdate
{'ssn': '415-39-7809', 'birthdate': datetime.date(1936, 5, 17)}

$ faker -r=3 -s=";" name -l zh_CN
李秀珍;
潘慧;
毛秀荣;

Custom providers

from faker import Faker
fake = Faker()

# first, import a similar Provider or use the default one
from faker.providers import BaseProvider

# create new provider class
class MyProvider(BaseProvider):
    def foo(self):
        return 'bar'

# then add new provider to faker instance
fake.add_provider(MyProvider)

# now you can use:
fake.foo()
# 'bar'

Customizing the word list

from faker import Faker
fake = Faker()

my_word_list = [
'danish','cheesecake','sugar',
'Lollipop','wafer','Gummies',
'sesame','Jelly','beans',
'pie','bar','Ice','oat' ]

fake.sentence()
# 'Expedita at beatae voluptatibus nulla omnis.'

print(fake.sentence(ext_word_list=my_word_list))
# 'Oat beans oat Lollipop bar cheesecake.'

Randomness

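The .random property on a generator returns the random.Random instance used to generate values. Because generators share this instance by default, accessing it can be useful for plugins that want to affect all faker instances.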

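When using Faker for unit testing, you will often want to generate the same data set each time. For convenience, Faker also provides a seed() method, which seeds the shared random number generator. Calling the same methods with the same version of faker and the same seed produces the same results.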

Each generator can also be switched to its own random.Random instance, separate from the shared one, by using the seed_instance() method, which otherwise works the same way.

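Note that because the data sets are continually updated, results are not guaranteed to be consistent across patch versions. If you hardcode results in your tests, make sure you pin the Faker version down to the patch number. The three snippets below show the .random property, Faker.seed(), and seed_instance() in turn.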

from faker import Faker
fake = Faker()
fake.random
fake.random.getstate()

from faker import Faker
fake = Faker()
Faker.seed(4321)

print(fake.name())
# 'Margaret Boehm'

from faker import Faker
fake = Faker()
fake.seed_instance(4321)

print(fake.name())
# 'Margaret Boehm'
