Python Basics 25: Using JSONPath

2022-08-04  Surpassme

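25 Processing JSON Data with Python

25.1 Introduction to JSON

25.1.1 What Is JSON

    JSON stands for JavaScript Object Notation. It is a lightweight data-interchange format based on a subset of ECMAScript, and it stores and represents data as text that is completely independent of any programming language. Its concise, clear hierarchy makes JSON an ideal data-interchange language. Its main characteristics are that it is easy for people to read, easy for machines to parse and generate, and compact enough to reduce network transfer time.

25.1.2 The Two JSON Structures

    Put simply, JSON can be understood as the arrays and objects of JavaScript. With these two structures it can represent all kinds of complex data.

25.1.2.1 Arrays

    In JavaScript an array is defined with square brackets [ ]. A typical definition looks like this: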

let array=["Surpass","28","Shanghai"];

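    To read a value from an array, use its index. Elements can be numbers, strings, arrays, objects, and so on.

25.1.2.2 Objects

    In JavaScript an object is defined with curly braces { }. A typical definition looks like this: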

let personInfo={
  name:"Surpass",
  age:28,
  location:"Shanghai"
}

    An object is a collection of key-value pairs. In JavaScript a value is read simply as variable.key. Values can be numbers, strings, arrays, objects, and so on.
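
    Since this chapter is about Python, here is a minimal sketch of the same two structures read from JSON text with the standard json module. The JSON strings are illustrative re-encodings of the JavaScript literals above (JSON itself requires double-quoted keys), and the variable names are assumptions made for this example.

```python
import json

# JSON text equivalent to the JavaScript array above
array_text = '["Surpass", "28", "Shanghai"]'
# JSON text equivalent to the JavaScript object above
# (unlike a JavaScript literal, JSON keys must be double-quoted)
person_text = '{"name": "Surpass", "age": 28, "location": "Shanghai"}'

array = json.loads(array_text)         # a JSON array becomes a Python list
person_info = json.loads(person_text)  # a JSON object becomes a Python dict

print(array[0])             # index access, like array[0] in JavaScript
print(person_info["name"])  # key access, like personInfo.name in JavaScript
```

    Both print calls output Surpass: the first through an index, the second through a key.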

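25.1.3 Supported Data Types

    JSON supports objects, arrays, strings, numbers, the booleans true and false, and null. Multiple values are separated by commas. The corresponding Python data types are as follows: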

JSON            Python
object          dict
array           list
string          str
number (int)    int
number (real)   float
true            True
false           False
null            None

25.2 Python Support for JSON

25.2.1 Python and JSON Data Types

    In Python, JSON data is handled mainly with the json module. Import the module before using it:

import json

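    The json module provides four main functions: dumps and loads convert between Python objects and JSON strings, while dump and load do the same through file objects.

[Image: 25-01PythonJson模拟操作json数据.png]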
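
    As a minimal sketch of the json module's four main functions (dumps, loads, dump, load), the example below round-trips one record through a string and through a file-like object. The record and variable names are made up for illustration, and io.StringIO stands in for a real file.

```python
import io
import json

record = {"name": "Surpass", "age": 28, "location": "Shanghai"}

# dumps: Python object -> JSON string
text = json.dumps(record)
# loads: JSON string -> Python object
restored = json.loads(text)

# dump / load do the same through a file-like object;
# io.StringIO is used here in place of a file opened with open()
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)
restored_from_file = json.load(buffer)

print(text)
print(restored == record, restored_from_file == record)
```

    The functions ending in "s" work on strings; the ones without it work on anything with write() or read().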

    While JSON is being processed, native Python types and JSON types are converted into each other. The conversion tables are as follows:

Python to JSON (encoding):

Python          JSON
dict            object
list            array
tuple           array
str             string
int             number
float           number
True            true
False           false
None            null

JSON to Python (decoding):

JSON            Python
object          dict
array           list
string          str
number (int)    int
number (real)   float
true            True
false           False
null            None
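
    The two tables are not exact mirrors of each other, which a short round trip makes visible. The sketch below uses only the standard json module; the dictionary and its key names are invented for this example.

```python
import json

python_value = {
    "tuple": (1, 2),   # tuple -> array
    "int": 28,         # int -> number
    "float": 8.95,     # float -> number
    "flag": True,      # True -> true
    "nothing": None,   # None -> null
}

encoded = json.dumps(python_value)
print(encoded)

decoded = json.loads(encoded)
# The JSON array comes back as a list, so the original tuple is not restored
print(type(decoded["tuple"]).__name__)
```

    Because both list and tuple encode to a JSON array, decoding always yields a list. Code that depends on tuples has to convert them back itself.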

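25.2.2 Common json Module Methods

    For more on Python's built-in json module, see my earlier article: https://www.cnblogs.com/surpassme/p/13034972.html

25.3 Processing JSON Data with JSONPath

    The built-in json module is easy and convenient for simple JSON data, but it becomes laborious once the data is complex or very large. In this section we use a third-party tool for that job: JSONPath.

25.3.1 What Is JSONPath

    JSONPath is an expression language for querying JSON data. It is commonly used to parse and process deeply nested JSON, and it works much like XPath, the expression language for XML.

25.3.2 Installation

    Install it as follows: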

pip install -U jsonpath

25.3.3 JSONPath Syntax

    JSONPath syntax is very similar to XPath. The table below shows how the two correspond:

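XPath   JSONPath            Description
/       $                   Root node/element
.       @                   Current node/element
/       . or []             Child operator
..      n/a                 Parent operator
//      ..                  Recursive descent into child elements
*       *                   Wildcard; matches all elements
@       n/a                 Attribute access; JSON data has no attributes
[]      []                  Subscript operator (supports simple iteration inside it, such as array indexing or selecting by content)
|       [,]                 Union operator; selects several names or indices at once
n/a     [start:end:step]    Array slice operator
[]      ?()                 Filter expression
n/a     ()                  Script expression, evaluated by the implementation
()      n/a                 Grouping; not supported in JSONPath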

The table above is taken from the original JSONPath article: https://goessner.net/articles/JsonPath/
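
    The recursive descent operator (..) is the one that most often saves effort on nested data. To show what it does, here is a minimal sketch in plain Python. It is not how any JSONPath library is implemented; the function name recursive_descent and the trimmed sample data are assumptions made for this example.

```python
def recursive_descent(node, key):
    """Yield every value stored under `key` at any depth, roughly like `$..key`."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # keep searching inside the value, whatever its key was
            yield from recursive_descent(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from recursive_descent(item, key)


# Trimmed, illustrative sample in the shape of a small bookstore
sample = {
    "store": {
        "book": [
            {"author": "Nigel Rees", "price": 8.95},
            {"author": "Evelyn Waugh", "price": 12.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

prices = list(recursive_descent(sample, "price"))
authors = list(recursive_descent(sample, "author"))
print(prices)
print(authors)
```

    The search never needs to know how deep a key sits. That is why $..price finds both the book prices and the bicycle price.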

    We will use the following sample data to compare the two:

{ "store": 
  {
    "book": [ 
      { "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      { "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      { "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      { "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

XPath                   JSONPath                                Result
/store/book/author      $.store.book[*].author                  Authors of all books under the book node
//author                $..author                               All authors
/store/*                $.store.*                               Everything in store: the books and the bicycle
/store//price           $.store..price                          All prices in store
//book[3]               $..book[2]                              Everything about the third book
//book[last()]          $..book[(@.length-1)] or $..book[-1:]   The last book
//book[position()<3]    $..book[0,1] or $..book[:2]             The first two books
//book[isbn]            $..book[?(@.isbn)]                      Books that have an isbn
//book[price<10]        $..book[?(@.price<10)]                  Books with a price below 10
//*                     $..*                                    All elements

Note that XPath indices start at 1, whereas JSONPath indices start at 0.
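
Zero-based indexing, slices, and filters in JSONPath line up closely with ordinary Python list operations. The sketch below applies the Python equivalents to a plain list of books. It illustrates the semantics only and does not call any JSONPath library; the shortened book records are an assumption for this example.

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

third = books[2]                               # $..book[2], XPath //book[3]
last = books[-1:]                              # $..book[-1:]
first_two = books[:2]                          # $..book[:2]
with_isbn = [b for b in books if "isbn" in b]  # $..book[?(@.isbn)]
cheap = [b for b in books if b["price"] < 10]  # $..book[?(@.price<10)]

print(third["title"])
print([b["title"] for b in cheap])
```

Like a Python slice, $..book[-1:] selects a one-element list rather than the bare element.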

You can practice JSONPath online at: http://jsonpath.com/

25.3.4 JSONPath Usage

    The basic form of a call is:

jsonPath(obj, expr [, args])

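    Its parameters are as follows:

    obj: the JSON data object

    expr: the JSONPath expression

    args: changes the output format, for example whether values or paths are returned

args.resultType accepts the output formats "VALUE", "PATH", and "IPATH"

    If an array is returned, the expression matched some data; a return value of false means nothing matched.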
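
    The array-or-false convention means a caller cannot loop over the result without checking it first. One way to hide that check is a small wrapper, sketched below. The helper name as_list is hypothetical and not part of any JSONPath package.

```python
def as_list(result):
    """Turn the array-or-False return convention into a plain list."""
    return [] if result is False else list(result)


# Hypothetical usage with a JSONPath function that follows this convention:
#   authors = as_list(jsonpath(data, "$..author"))
no_match = as_list(False)
one_match = as_list(["Nigel Rees"])
print(no_match)
print(one_match)
```

    With the wrapper in place, a for loop over the result simply runs zero times when nothing matches.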

25.3.5 Using JSONPath in Python

from jsonpath import jsonpath
import json

data = {
    "store":
        {
            "book": [
                {
                    "category": "reference",
                    "author": "Nigel Rees",
                    "title": "Sayings of the Century",
                    "price": 8.95
                },
                {
                    "category": "fiction",
                    "author": "Evelyn Waugh",
                    "title": "Sword of Honour",
                    "price": 12.99
                },
                {
                    "category": "fiction",
                    "author": "Herman Melville",
                    "title": "Moby Dick",
                    "isbn": "0-553-21311-3",
                    "price": 8.99
                },
                {
                    "category": "fiction",
                    "author": "J. R. R. Tolkien",
                    "title": "The Lord of the Rings",
                    "isbn": "0-395-19395-8",
                    "price": 22.99
                }
            ],
            "bicycle": {
                "color": "red",
                "price": 19.95
            }
        }
}

# Get the author of every book under the book node
getAllBookAuthor = jsonpath(data, "$.store.book[*].author")
print(f"getAllBookAuthor is :{json.dumps(getAllBookAuthor,indent=4)}")
# Get every author, wherever it appears
getAllAuthor = jsonpath(data, "$..author")
print(f"getAllAuthor is {json.dumps(getAllAuthor,indent=4)}")
# Get the elements of store, which include book and bicycle
getAllStoreElement = jsonpath(data, "$.store.*")
print(f"getAllStoreElement is {json.dumps(getAllStoreElement,indent=4)}")
# Get every price in store (bracket and dot notation give the same result)
getAllStorePriceA = jsonpath(data, "$[store]..price")
getAllStorePriceB = jsonpath(data, "$.store..price")
print(f"getAllStorePrictA is {getAllStorePriceA}\ngetAllStorePriceB is {getAllStorePriceB}")
# Get everything about the third book (indices start at 0)
getThirdBookInfo = jsonpath(data, "$..book[2]")
print(f"getThirdBookInfo is {json.dumps(getThirdBookInfo,indent=4)}")
# Get the last book
getLastBookInfo = jsonpath(data, "$..book[-1:]")
print(f"getLastBookInfo is {json.dumps(getLastBookInfo,indent=4)}")
# Get the first two books
getFirstAndSecondBookInfo = jsonpath(data, "$..book[:2]")
print(f"getFirstAndSecondBookInfo is {json.dumps(getFirstAndSecondBookInfo,indent=4)}")
# Keep only the books that have an isbn
getWithFilterISBN = jsonpath(data, "$..book[?(@.isbn)]")
print(f"getWithFilterISBN is {json.dumps(getWithFilterISBN,indent=4)}")
# Keep only the books with a price below 10
getWithFilterPrice = jsonpath(data, "$..book[?(@.price<10)]")
print(f"getWithFilterPrice is {json.dumps(getWithFilterPrice,indent=4)}")
# Get all elements
getAllElement = jsonpath(data, "$..*")
print(f"getAllElement is {json.dumps(getAllElement,indent=4)}")
# When nothing matches, the result is False
noMatchElement = jsonpath(data, "$..surpass")
print(f"noMatchElement is {noMatchElement}")
# Change the output format: return paths instead of values
controlleOutput = jsonpath(data, expr="$..author", result_type="PATH")
print(f"controlleOutput is {json.dumps(controlleOutput,indent=4)}")

    The final output is as follows:

getAllBookAuthor is :[
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllAuthor is [
    "Nigel Rees",
    "Evelyn Waugh",
    "Herman Melville",
    "J. R. R. Tolkien"
]
getAllStoreElement is [
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    }
]
getAllStorePrictA is [8.95, 12.99, 8.99, 22.99, 19.95]
getAllStorePriceB is [8.95, 12.99, 8.99, 22.99, 19.95]
getThirdBookInfo is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getLastBookInfo is [
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getFirstAndSecondBookInfo is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    }
]
getWithFilterISBN is [
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    }
]
getWithFilterPrice is [
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    }
]
getAllElement is [
    {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    },
    [
        {
            "category": "reference",
            "author": "Nigel Rees",
            "title": "Sayings of the Century",
            "price": 8.95
        },
        {
            "category": "fiction",
            "author": "Evelyn Waugh",
            "title": "Sword of Honour",
            "price": 12.99
        },
        {
            "category": "fiction",
            "author": "Herman Melville",
            "title": "Moby Dick",
            "isbn": "0-553-21311-3",
            "price": 8.99
        },
        {
            "category": "fiction",
            "author": "J. R. R. Tolkien",
            "title": "The Lord of the Rings",
            "isbn": "0-395-19395-8",
            "price": 22.99
        }
    ],
    {
        "color": "red",
        "price": 19.95
    },
    {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
    },
    {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
    },
    {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
    },
    {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
    },
    "reference",
    "Nigel Rees",
    "Sayings of the Century",
    8.95,
    "fiction",
    "Evelyn Waugh",
    "Sword of Honour",
    12.99,
    "fiction",
    "Herman Melville",
    "Moby Dick",
    "0-553-21311-3",
    8.99,
    "fiction",
    "J. R. R. Tolkien",
    "The Lord of the Rings",
    "0-395-19395-8",
    22.99,
    "red",
    19.95
]
noMatchElement is False
controlleOutput is [
    "$['store']['book'][0]['author']",
    "$['store']['book'][1]['author']",
    "$['store']['book'][2]['author']",
    "$['store']['book'][3]['author']"
]