imposm.parser

Concepts

To use imposm.parser you need to understand three basic concepts: types, callbacks and tag filters.

Types

Note

In this document, capitalized Node, Way and Relation refer to the OSM types, while lowercase node, way and relation refer to the imposm.parser types.

OSM has three fundamental element types: Nodes, Ways and Relations. imposm.parser splits OSM Nodes into two types: coords and nodes.

coords store only coordinates, and there is a coord for every OSM Node. nodes also store tags, and there is a node only for each OSM Node that has tags.

coords

A tuple with the OSM ID, the longitude and the latitude of that node, in that order.

(4234432, 175.2, -32.1)

imposm.parser returns a coord for each OSM Node, even if that OSM Node is also a node (i.e. it has tags).

nodes

A tuple with the OSM ID, a tags dictionary and a nested tuple with the longitude and latitude of that node.

(982347, {'name': 'Somewhere', 'place': 'village'}, (-120.2, 23.21))

ways

A tuple with the OSM ID, a tags dictionary and a list of references (the OSM IDs of the Nodes that make up the Way).

(87644, {'name': 'my way', 'highway': 'path'}, [123, 345, 567])

relations

A tuple with the OSM ID, a tags dictionary and a list of member tuples. Each member tuple contains the reference, the type (one of 'node', 'way' or 'relation') and the role.

(87644, {'type': 'multipolygon'}, [(123, 'way', 'outer'), (234, 'way', 'inner')])
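To make the four tuple shapes concrete, here is a small sketch that unpacks sample values in the shapes described above and resolves a way's references to coordinates. The `resolve_way` helper and all IDs and coordinates are hypothetical and only serve as illustration; they are not part of imposm.parser.

```python
# Sample elements in the shapes described above (all values are made up).
coords = [(123, 8.10, 53.10), (345, 8.20, 53.20), (567, 8.30, 53.30)]
nodes = [(345, {'name': 'Somewhere', 'place': 'village'}, (8.20, 53.20))]
ways = [(87644, {'name': 'my way', 'highway': 'path'}, [123, 345, 567])]
relations = [(99001, {'type': 'multipolygon'},
              [(87644, 'way', 'outer'), (234, 'way', 'inner')])]

# Index coords by OSM ID so that way references can be looked up.
coord_index = dict((osm_id, (lon, lat)) for osm_id, lon, lat in coords)


def resolve_way(way, index):
    """Hypothetical helper: return (osm_id, tags, [(lon, lat), ...]) for a way."""
    osm_id, tags, refs = way
    return osm_id, tags, [index[ref] for ref in refs if ref in index]


way_id, way_tags, line = resolve_way(ways[0], coord_index)

# Each relation member is a (reference, type, role) tuple.
outer_refs = [ref for ref, member_type, role in relations[0][2]
              if role == 'outer']
```

Note that the node with ID 345 appears in both `coords` and `nodes`, matching the rule that every tagged OSM Node is also returned as a coord.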

Callbacks

The parser takes up to four callback functions, one for each data type (coords, nodes, ways and relations). The callbacks are optional: you don't need to pass a relations callback if you are not interested in relations.

Each function should expect a list with zero or more items of the corresponding type.

Here is an example callback that prints the coordinates of all Nodes.

def coords_callback(coords):
    # coords is a list of (osm_id, lon, lat) tuples
    for osm_id, lon, lat in coords:
        print('%s %.4f %.4f' % (osm_id, lon, lat))
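Callbacks are often bound methods, so that state can be collected across calls. The sketch below assumes a hypothetical `HighwayCounter` class and a hypothetical `count_highways` wrapper; only `OSMParser`, its `concurrency` and `ways_callback` arguments and its `parse` method come from the API described in this document.

```python
class HighwayCounter(object):
    """Hypothetical collector: counts ways that carry a 'highway' tag."""

    def __init__(self):
        self.highways = 0

    def ways(self, ways):
        # Called with a list of zero or more (osm_id, tags, refs) tuples.
        for osm_id, tags, refs in ways:
            if 'highway' in tags:
                self.highways += 1


def count_highways(filename, concurrency=2):
    # Imported here so the class above can be tried without the library installed.
    from imposm.parser import OSMParser
    counter = HighwayCounter()
    parser = OSMParser(concurrency=concurrency, ways_callback=counter.ways)
    parser.parse(filename)
    return counter.highways


# The callback can be exercised directly with hand-made batches:
counter = HighwayCounter()
counter.ways([(1, {'highway': 'path'}, [10, 11]),
              (2, {'building': 'yes'}, [12, 13])])
counter.ways([])  # an empty batch is valid input
```

Calling `count_highways` with the path to an OSM file would run the same counter against real data.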

Tag filters

Tag filters are functions that manipulate tag dictionaries. A filter should modify the dictionary in place; its return value is ignored, so returning a new dictionary has no effect.

Elements are handled differently if you remove all tags from the dictionary. nodes and relations with empty tags will not be returned, but ways will be, since they might be needed for building relations.

Here is an example filter that applies a whitelist: every tag whose key is not in the whitelist is removed.

whitelist = set(('name', 'place', 'amenity'))

def tag_filter(tags):
    # iterate over a copy of the keys, since the dictionary is modified in the loop
    for key in list(tags.keys()):
        if key not in whitelist:
            del tags[key]
    if 'name' in tags and len(tags) == 1:
        # tags with only a name carry no information
        # about how to handle this element
        del tags['name']
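The effect of emptying a tag dictionary can be illustrated without the parser. The sketch below is a simulation of the rules stated above, not the library's implementation; `keep_named` and `simulate` are hypothetical names and the sample elements are made up.

```python
def keep_named(tags):
    """Hypothetical filter: empty the dictionary unless it has a 'name' tag."""
    if 'name' not in tags:
        tags.clear()  # modify in place; a return value would be ignored


def simulate(elements, tag_filter, keep_empty):
    """Mimic the documented behaviour: apply the filter, then drop elements
    whose tags are empty unless keep_empty is true (as described for ways)."""
    result = []
    for element in elements:
        tags = element[1]
        tag_filter(tags)
        if tags or keep_empty:
            result.append(element)
    return result


nodes = [(1, {'name': 'Somewhere'}, (8.1, 53.1)),
         (2, {'created_by': 'x'}, (8.2, 53.2))]
ways = [(10, {'name': 'my way'}, [1, 2]),
        (11, {'source': 'survey'}, [2, 3])]

kept_nodes = simulate(nodes, keep_named, keep_empty=False)
kept_ways = simulate(ways, keep_named, keep_empty=True)
```

Here node 2 is dropped because its tags end up empty, while way 11 is kept with an empty tag dictionary.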

Parsing API

Imposm comes with a single OSMParser class that implements an easy-to-use, callback-based parser for OSM files.

It supports XML and PBF (the Protocol Buffers based binary format) files. It also supports BZip2-compressed XML files.

Concurrency

The parser uses multiprocessing to distribute the parsing across multiple CPUs. This works with PBF as well as XML files.

You can pass the concurrency as an argument to OSMParser; it defaults to the number of CPUs and cores of the host system.

concurrency defines the number of parser processes. The callbacks are handled in the main process, and the decompression (if you have a BZip2-compressed file) is handled in an additional process, so both run on top of the parser processes. You might therefore get better results if you reduce this number on systems with more than two cores.

On systems with hyper-threading CPUs you can set it to twice the number of physical cores.
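One way to turn this advice into a starting value is sketched below. `suggest_concurrency` is a hypothetical helper, and its arithmetic is only a heuristic assumption (reserve one core for the main process and one more for decompression); it is not taken from imposm.parser, and measuring on your own data is the better guide.

```python
import multiprocessing


def suggest_concurrency(filename, cpus=None):
    """Hypothetical heuristic for the OSMParser concurrency argument."""
    if cpus is None:
        cpus = multiprocessing.cpu_count()
    if cpus <= 2:
        return cpus  # small systems: keep the default
    reserved = 1  # main process that runs the callbacks
    if filename.endswith('.bz2'):
        reserved += 1  # decompression runs in an additional process
    return max(1, cpus - reserved)
```

The result could then be passed as the `concurrency` argument when constructing `OSMParser`.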

API

class imposm.parser.OSMParser(concurrency=None, nodes_callback=None, ways_callback=None, relations_callback=None, coords_callback=None, nodes_tag_filter=None, ways_tag_filter=None, relations_tag_filter=None, marshal_elem_data=False)

High-level OSM parser.

Parameters:
  • concurrency – number of parser processes to start. Defaults to the number of CPUs.
  • xxx_callback – callback functions for coords, nodes, ways and relations. Each callback function gets called with a list of elements. See Callbacks above.
  • xxx_tag_filter – functions that can manipulate the tag dictionary. Nodes and relations without tags will not be passed to the callback. See Tag filters above.

parse(filename)

Parse the given file. Detects the file type based on the file suffix. Supports .pbf, .osm and .osm.bz2.

parse_pbf_file(filename)

Parse a PBF file.

parse_xml_file(filename)

Parse an XML file. Supports BZip2-compressed files if the filename ends with .bz2.
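The suffix-based detection described for parse can be pictured with the following sketch. `detect_format` is a hypothetical function written for illustration; it is not the library's code, and the library's exact matching rules may differ.

```python
def detect_format(filename):
    """Hypothetical: map a file name to the parse method that would handle it."""
    if filename.endswith('.pbf'):
        return 'parse_pbf_file'
    if filename.endswith('.osm') or filename.endswith('.osm.bz2'):
        return 'parse_xml_file'
    raise ValueError('unsupported file type: %r' % filename)
```

If a file does not carry one of these suffixes, calling parse_pbf_file or parse_xml_file directly avoids relying on detection.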
