Bits bulbs (#92)

* 📚 update doc strings and plugin compactibility list

* 🔥 remove PY2 related code and update docs

* 🔬 more test coverage
This commit is contained in:
jaska 2020-10-06 22:13:56 +01:00 committed by GitHub
parent 9517e0bf49
commit a923bce3be
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
14 changed files with 146 additions and 91 deletions

View File

@ -57,6 +57,7 @@ get_data(.., library='pyexcel-ods')
============= ======= ======== ======= ======== ======== ========
`pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_
============= ======= ======== ======= ======== ======== ========
0.6.0+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+
0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x
@ -91,7 +92,6 @@ get_data(.., library='pyexcel-ods')
csvz
sqlalchemy
django
options
extensions
@ -108,6 +108,7 @@ API
.. autosummary::
:toctree: api/
iget_data
get_data
save_data

View File

@ -1,5 +1,5 @@
pyexcel_io.get_data
===================
pyexcel\_io.get\_data
=====================
.. currentmodule:: pyexcel_io

View File

@ -0,0 +1,6 @@
pyexcel\_io.iget\_data
======================
.. currentmodule:: pyexcel_io
.. autofunction:: iget_data

View File

@ -1,5 +1,5 @@
pyexcel_io.save_data
====================
pyexcel\_io.save\_data
======================
.. currentmodule:: pyexcel_io

View File

@ -2,9 +2,26 @@ Common parameters
================================================================================
'library' option is added
--------------------------------------------------------------------------------
In order to have overlapping plugins co-exit, 'library' option is added to
get_data and save_data.
get_data only parameters
-------------------------------
keep_trailing_empty_cells
********************************************************************************
default: False
If turned on, the return data will contain trailing empty cells.
auto_dectect_datetime
--------------------------------------------------------------------------------
********************************************************************************
The datetime formats are:
@ -14,11 +31,6 @@ The datetime formats are:
Any other datetime formats will be thrown as ValueError
'library' option is added
--------------------------------------------------------------------------------
In order to have overlapping plugins co-exit, 'library' option is added to
get_data and save_data.
csv only parameters
--------------------------------------------------------------------------------

View File

@ -1,8 +1,12 @@
Extend pyexcel-io Tutorial
================================================================================
You are welcome extend pyexcel-io to read and write more tabular formats. In
github repo, you will find two examples in `examples` folder. This section
You are welcome toextend pyexcel-io to read and write more tabular formats.
No. 1 rule, your plugin must have a prefix 'pyexcel_' in its module path.
For example, `pyexcel-xls` has 'pyexcel_xls' as its module path. Otherwise,
pyexcel-io will not load your plugin.
On github, you will find two examples in `examples` folder. This section
explains its implementations to help you write yours.
.. note::
@ -10,7 +14,7 @@ explains its implementations to help you write yours.
No longer, you will need to do explicit imports for pyexcel-io extensions.
Instead, you install them and manage them via pip.
Reader
Simple Reader for a yaml file
--------------------------------------------------------------------------------
Suppose we have a yaml file, containing a dictionary where the values are
@ -60,7 +64,7 @@ files on physical disk. "memory" means a file stream. "content" means a string b
:lines: 36-41
**Test your reader **
**Test your reader**
Let's run the following code and see if it works.
@ -68,13 +72,21 @@ Let's run the following code and see if it works.
:language: python
:lines: 43-45
Writer
You would see these in standard output:
.. code-block:: bash
$ python custom_yaml_reader.py
OrderedDict([('sheet 1', [[1, 2, 3], [2, 3, 4]]), ('sheet 2', [['A', 'B', 'C']])])
A writer to write content in yaml
--------------------------------------------------------------------------------
Now for the writer, let's write a pyexcel-io writer that write a dictionary of
two dimentaional arrays back into a yaml file seen above.
** Implement IWriter **
**Implement IWriter**
Two abstract functions are required:
@ -85,7 +97,7 @@ Two abstract functions are required:
:language: python
:lines: 18-30
** Implement ISheetWriter **
**Implement ISheetWriter**
It is imagined that you will have your own sheet writer. You simply need to figure
out how to write a row. Row by row write action was already written by `ISheetWrier`.
@ -111,8 +123,25 @@ Let's run the following code and please examine `mytest.yaml` yourself.
:language: python
:lines: 40-46
And you shall find a file named 'mytest.yaml':
.. code-block:: bash
$ cat mytest.yaml
sheet 1:
- - 1
- 3
- 4
- - 2
- 4
- 9
sheet 2:
- - B
- C
- D
Other pyexcel-io plugins
-----------------------------------------------------------------------------
@ -138,26 +167,6 @@ And you can also get the data back::
[[1, 2, 3]]
Work with memory file
-----------------------------------------------------------------------------
Here is the sample code to work with memory file::
>>> from pyexcel_io.manager import get_io
>>> io = get_io("xls")
>>> data = [[1,2,3]]
>>> save_data(io, data, "xls")
The difference is that you have mention file type if you use :meth:`pyexcel_io.save_data`
And you can also get the data back::
>>> data = get_data(io, "xls")
>>> data['pyexcel_sheet1']
[[1, 2, 3]]
The same applies to :meth:`pyexcel_io.get_data`.
Other formats
-----------------------------------------------------------------------------

View File

@ -159,6 +159,7 @@ get_data(.., library='pyexcel-ods')
============= ======= ======== ======= ======== ======== ========
`pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_
============= ======= ======== ======= ======== ======== ========
0.6.0+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+
0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x
@ -193,7 +194,6 @@ get_data(.., library='pyexcel-ods')
csvz
sqlalchemy
django
options
extensions
@ -210,6 +210,7 @@ API
.. autosummary::
:toctree: api/
iget_data
get_data
save_data

View File

@ -1,11 +0,0 @@
Options
======================
Here is the documentation on the keyword options for get_data.
keep_trailing_empty_cells
------------------------------
default: False
If turned on, the return data will contain trailing empty cells.

View File

@ -1,5 +1,4 @@
Rendering(Formatting) the data
================================================================================
You might want to do custom rendering on your data obtained. `row_renderer` was

View File

@ -7,16 +7,9 @@
:copyright: (c) 2014-2020 by Onni Software Ltd.
:license: New BSD License, see LICENSE for more details
"""
# flake8: noqa
# pylint: disable=import-error
# pylint: disable=invalid-name
# pylint: disable=too-few-public-methods
# pylint: disable=ungrouped-imports
# pylint: disable=redefined-variable-type
import sys
import types
import logging
from collections import OrderedDict
from io import BytesIO, StringIO # noqa: F401
from collections import OrderedDict # noqa: F401
try:
from logging import NullHandler
@ -27,8 +20,6 @@ except ImportError:
pass
from io import BytesIO, StringIO
text_type = str
irange = range
@ -48,7 +39,4 @@ def isstream(instance):
def is_string(atype):
"""find out if a type is str or not"""
if atype == str:
return True
return False
return atype == str

View File

@ -25,6 +25,10 @@ from pyexcel_io.exceptions import (
def iget_data(afile, file_type=None, **keywords):
"""Get data from an excel file source
The data has not gone into memory yet. If you use dedicated partial read
plugins, such as pyexcel-xlsxr, pyexcel-odsr, you will notice
the memory consumption drop when you work with big files.
:param afile: a file name, a file stream or actual content
:param sheet_name: the name of the sheet to be loaded
:param sheet_index: the index of the sheet to be loaded
@ -32,9 +36,6 @@ def iget_data(afile, file_type=None, **keywords):
:param file_type: used only when filename is not a physical file name
:param force_file_type: used only when filename refers to a physical file
and it is intended to open it as forced file type.
:param streaming: toggles the type of returned data. The values of the
returned dictionary remain as generator if it is set
to True. Default is False.
:param library: explicitly name a library for use.
e.g. library='pyexcel-ods'
:param auto_detect_float: defaults to True
@ -44,6 +45,7 @@ def iget_data(afile, file_type=None, **keywords):
:param ignore_nan_text: various forms of 'NaN', 'nan' are ignored
:param default_float_nan: choose one form of 'NaN', 'nan'
:param pep_0515_off: turn off pep 0515. default to True.
:param keep_trailing_empty_cells: keep trailing columns. default to False
:param keywords: any other library specific parameters
:returns: an ordered dictionary
"""
@ -59,7 +61,10 @@ def get_data(afile, file_type=None, streaming=None, **keywords):
:param afile: a file name, a file stream or actual content
:param sheet_name: the name of the sheet to be loaded
:param sheet_index: the index of the sheet to be loaded
:param sheets: a list of sheet to be loaded
:param file_type: used only when filename is not a physial file name
:param force_file_type: used only when filename refers to a physical file
and it is intended to open it as forced file type.
:param streaming: toggles the type of returned data. The values of the
returned dictionary remain as generator if it is set
to True. Default is False.
@ -69,6 +74,10 @@ def get_data(afile, file_type=None, streaming=None, **keywords):
:param auto_detect_int: defaults to True
:param auto_detect_datetime: defaults to True
:param ignore_infinity: defaults to True
:param ignore_nan_text: various forms of 'NaN', 'nan' are ignored
:param default_float_nan: choose one form of 'NaN', 'nan'
:param pep_0515_off: turn off pep 0515. default to True.
:param keep_trailing_empty_cells: keep trailing columns. default to False
:param keywords: any other library specific parameters
:returns: an ordered dictionary
"""

View File

@ -1,5 +1,4 @@
import os
import sys
import types
from zipfile import BadZipfile
from unittest import TestCase
@ -12,8 +11,6 @@ from pyexcel_io._compact import BytesIO, StringIO, OrderedDict, is_string
from nose.tools import eq_, raises
PY2 = sys.version_info[0] == 2
@raises(IOError)
def test_directory_name_as_file():
@ -116,11 +113,8 @@ def test_load_unknown_data_from_memory():
@raises(BadZipfile)
def test_load_csvz_data_from_memory():
if not PY2:
io = StringIO()
get_data(io, file_type="csvz")
else:
raise BadZipfile("pass it")
io = StringIO()
get_data(io, file_type="csvz")
@raises(IOError)
@ -130,12 +124,9 @@ def test_write_xlsx_data():
@raises(Exception)
def test_writer_csvz_data_from_memory():
if not PY2:
io = StringIO()
writer = get_writer(io, file_type="csvz")
writer.write({"adb": [[2, 3]]})
else:
raise Exception("pass it")
io = StringIO()
writer = get_writer(io, file_type="csvz")
writer.write({"adb": [[2, 3]]})
@raises(exceptions.NoSupportingPluginFound)
@ -264,10 +255,7 @@ def test_conversion_from_bytes_to_text():
def test_is_string():
if PY2:
assert is_string(type(u"a")) is True
else:
assert is_string(type("a")) is True
assert is_string(type("a")) is True
def test_generator_is_obtained():

View File

@ -1,6 +1,5 @@
# -*- coding: utf-8 -*-
import os
import sys
import zipfile
from unittest import TestCase
@ -12,8 +11,6 @@ from pyexcel_io._compact import OrderedDict
from nose.tools import raises
PY2 = sys.version_info[0] == 2
class TestCSVZ(TestCase):
file_type = "csvz"

View File

@ -1,10 +1,17 @@
from datetime import time, datetime, timedelta
from pyexcel_io.service import (
date_value,
time_value,
boolean_value,
ods_bool_value,
ods_date_value,
ods_time_value,
ods_float_value,
throw_exception,
detect_int_value,
detect_float_value,
ods_timedelta_value,
)
from pyexcel_io.exceptions import IntegerAccuracyLossError
@ -106,3 +113,52 @@ def test_big_int_value():
@raises(IntegerAccuracyLossError)
def test_throw_exception():
throw_exception(1000000000000000)
def test_boolean_value():
fixture = ["true", "false", 1]
expected = [True, False, 1]
actual = [boolean_value(element) for element in fixture]
eq_(actual, expected)
def test_time_delta_presentation():
a = datetime(2020, 12, 12, 12, 12, 12)
b = datetime(2020, 11, 12, 12, 12, 11)
delta = a - b
value = ods_timedelta_value(delta)
eq_(value, "PT720H00M01S")
def test_ods_bool_to_string():
fixture = [True, False]
expected = ["true", "false"]
actual = [ods_bool_value(element) for element in fixture]
eq_(actual, expected)
def test_ods_time_value():
test = datetime(2020, 10, 6, 11, 11, 11)
actual = ods_time_value(test)
eq_(actual, "PT11H11M11S")
def test_ods_date_value():
test = datetime(2020, 10, 6, 11, 11, 11)
actual = ods_date_value(test)
eq_(actual, "2020-10-06")
def test_time_value_returns_time_delta():
test_time_value = "PT720H00M01S"
delta = time_value(test_time_value)
eq_(delta, timedelta(days=30, seconds=1))
def test_time_value():
test_time_value = "PT23H00M01S"
delta = time_value(test_time_value)
eq_(delta, time(23, 0, 1))