pyexcel - Lets you focus on data instead of file formats

Author: C.W.
Source code: http://github.com/pyexcel/pyexcel.git
Issues: http://github.com/pyexcel/pyexcel/issues
License: New BSD License
Development: 0.4.5
Released: 0.4.4
Generated: Apr 19, 2017

Introduction

pyexcel provides a single application programming interface for reading, manipulating and writing data in different Excel formats. It aims to make processing information stored in Excel files an enjoyable task. Data in Excel files can be turned into an array or a dict with minimal code, and vice versa. The library focuses on data processing with Excel files as the storage medium, so fonts, colors and charts are not, and will not be, considered.

The idea originated from a common usability problem in developing Excel-file-driven web applications for non-technical office workers, such as office assistants and human resource administrators. Not everyone knows the difference between the various Excel formats: csv, xls, xlsx. Instead of training those people about file formats, this library helps web developers handle most Excel file formats through a common programming interface. To add support for a specific Excel file format to your application, all you need to do is install an extra pyexcel plugin; no code change to your application is required. Within the wider community, this library and its associated packages aim to be a small, easy-to-install alternative to Pandas.

Note

Since version 0.2.2, plugins no longer need to be imported explicitly. They are imported automatically if they are installed. Please use pip to manage the plugins.

Installation

You can install it via pip:

$ pip install pyexcel

or clone it and install it:

$ git clone http://github.com/pyexcel/pyexcel.git
$ cd pyexcel
$ python setup.py install

To support individual Excel file formats, install the corresponding plugins as needed:

A list of file formats supported by external plugins

Package name   Supported file formats                   Dependencies  Python versions
-------------  ---------------------------------------  ------------  ------------------------------------
pyexcel-io     csv, csvz [1], tsv, tsvz [2]                           2.6, 2.7, 3.3, 3.4, 3.5, 3.6, pypy
pyexcel-xls    xls, xlsx (read only), xlsm (read only)  xlrd, xlwt    same as above
pyexcel-xlsx   xlsx                                     openpyxl      same as above
pyexcel-xlsxw  xlsx (write only)                        XlsxWriter    same as above
pyexcel-ods3   ods                                      ezodf, lxml   2.6, 2.7, 3.3, 3.4, 3.5, 3.6
pyexcel-ods    ods                                      odfpy         same as above
pyexcel-odsr   ods (read only)                          lxml          same as above
pyexcel-text   (write only) json, rst, mediawiki,       tabulate      2.6, 2.7, 3.3, 3.4, 3.5, pypy, pypy3
               html, latex, grid, pipe, orgtbl,
               plain, simple

Footnotes

[1] zipped csv file
[2] zipped tsv file

For compatibility tables of pyexcel-io plugins, see the pyexcel-io project README.

Plugin compatibility table

pyexcel  pyexcel-io  pyexcel-text
-------  ----------  ------------
0.4.0+   0.3.0       0.2.5
0.3.0+   0.2.3       0.2.4
0.2.2+   0.2.0+      0.2.1+
0.2.1    0.1.0       0.2.0
0.2.0    0.1.0       0.1.0+
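
Because plugins are picked up automatically once they are installed, it can be handy to check which ones are present in the current environment before relying on a format. The following is a minimal sketch using only the standard library. The helper `installed_plugins` is hypothetical and not part of pyexcel; the module names assume the usual convention that the package `pyexcel-xls` is imported as `pyexcel_xls`; and `importlib.util.find_spec` requires Python 3.4 or later.

```python
import importlib.util

# Package name on PyPI -> importable module name.
# Assumption: each dash in the package name becomes an underscore.
PLUGINS = {
    "pyexcel-io": "pyexcel_io",
    "pyexcel-xls": "pyexcel_xls",
    "pyexcel-xlsx": "pyexcel_xlsx",
    "pyexcel-xlsxw": "pyexcel_xlsxw",
    "pyexcel-ods3": "pyexcel_ods3",
    "pyexcel-ods": "pyexcel_ods",
    "pyexcel-odsr": "pyexcel_odsr",
    "pyexcel-text": "pyexcel_text",
}


def installed_plugins(plugins=PLUGINS):
    """Map each package name to True if its module can be found, else False."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in plugins.items()
    }


if __name__ == "__main__":
    for package, present in sorted(installed_plugins().items()):
        status = "installed" if present else "missing (pip install %s)" % package
        print("%-14s %s" % (package, status))
```

The check only looks the modules up without importing them, so it does not trigger any plugin registration by itself.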

Usage

Suppose you want to process the following Excel data:

Name Age
Adam 28
Beatrice 29
Ceri 30
Dean 26

Here is an example of its usage:

>>> import pyexcel as pe
>>> records = pe.iget_records(file_name="your_file.xls")
>>> for record in records:
...     print("%s is aged at %d" % (record['Name'], record['Age']))
Adam is aged at 28
Beatrice is aged at 29
Ceri is aged at 30
Dean is aged at 26

