Bits bulbs (#92)

* 📚 update doc strings and plugin compatibility list

* 🔥 remove PY2 related code and update docs

* 🔬 more test coverage
jaska 2020-10-06 22:13:56 +01:00 committed by GitHub
parent 9517e0bf49
commit a923bce3be
14 changed files with 146 additions and 91 deletions

View File

@ -57,6 +57,7 @@ get_data(.., library='pyexcel-ods')
============= ======= ======== ======= ======== ======== ======== ============= ======= ======== ======= ======== ======== ========
`pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_ `pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_
============= ======= ======== ======= ======== ======== ======== ============= ======= ======== ======= ======== ======== ========
0.6.0+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+ 0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+
0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x
@ -91,7 +92,6 @@ get_data(.., library='pyexcel-ods')
csvz csvz
sqlalchemy sqlalchemy
django django
options
extensions extensions
@ -108,6 +108,7 @@ API
.. autosummary:: .. autosummary::
:toctree: api/ :toctree: api/
iget_data
get_data get_data
save_data save_data

View File

@ -1,5 +1,5 @@
pyexcel_io.get_data pyexcel\_io.get\_data
=================== =====================
.. currentmodule:: pyexcel_io .. currentmodule:: pyexcel_io

View File

@ -0,0 +1,6 @@
pyexcel\_io.iget\_data
======================
.. currentmodule:: pyexcel_io
.. autofunction:: iget_data

View File

@ -1,5 +1,5 @@
pyexcel_io.save_data pyexcel\_io.save\_data
==================== ======================
.. currentmodule:: pyexcel_io .. currentmodule:: pyexcel_io

View File

@ -2,9 +2,26 @@ Common parameters
================================================================================ ================================================================================
'library' option is added
--------------------------------------------------------------------------------
In order to have overlapping plugins co-exist, the 'library' option is added to
get_data and save_data.
get_data only parameters
-------------------------------
keep_trailing_empty_cells
********************************************************************************
default: False
If turned on, the returned data will contain trailing empty cells.
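To make the option concrete, here is a minimal sketch of what "trailing empty cells" means for a single row. The helper `strip_trailing_empty_cells` is hypothetical and written only for illustration; it is not pyexcel-io's implementation, and it assumes an empty cell is represented by an empty string.

```python
def strip_trailing_empty_cells(row, empty=""):
    """Hypothetical helper: drop empty cells from the END of a row only.

    Empty cells in the middle of the row are kept, because removing them
    would shift the remaining values into the wrong columns.
    """
    end = len(row)
    while end > 0 and row[end - 1] == empty:
        end -= 1
    return row[:end]


row = [1, "", 3, "", ""]

# Roughly what a reader would return with keep_trailing_empty_cells=False
trimmed = strip_trailing_empty_cells(row)

# Roughly what it would return with keep_trailing_empty_cells=True
kept = list(row)

print(trimmed)
print(kept)
```

With the option left at its default, only the two cells after `3` would be dropped; the empty cell between `1` and `3` stays.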
auto_detect_datetime auto_detect_datetime
-------------------------------------------------------------------------------- ********************************************************************************
The datetime formats are: The datetime formats are:
@ -14,11 +31,6 @@ The datetime formats are:
Any other datetime format will raise a ValueError Any other datetime format will raise a ValueError
'library' option is added
--------------------------------------------------------------------------------
In order to have overlapping plugins co-exist, the 'library' option is added to
get_data and save_data.
csv only parameters csv only parameters
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------

View File

@ -1,8 +1,12 @@
Extend pyexcel-io Tutorial Extend pyexcel-io Tutorial
================================================================================ ================================================================================
You are welcome extend pyexcel-io to read and write more tabular formats. In You are welcome to extend pyexcel-io to read and write more tabular formats.
github repo, you will find two examples in `examples` folder. This section Rule No. 1: your plugin must have the prefix 'pyexcel_' in its module path.
For example, `pyexcel-xls` has 'pyexcel_xls' as its module path. Otherwise,
pyexcel-io will not load your plugin.
On GitHub, you will find two examples in the `examples` folder. This section
explains its implementations to help you write yours. explains its implementations to help you write yours.
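The naming rule can be checked before you publish a plugin. The sketch below is an assumption-laden illustration: the diff does not show how pyexcel-io discovers plugins, so `looks_like_plugin` and `find_candidate_plugins` are hypothetical helpers that only test the 'pyexcel_' prefix using the standard library's `pkgutil.iter_modules`.

```python
import pkgutil

# Prefix taken from the rule in the tutorial text above
PLUGIN_PREFIX = "pyexcel_"


def looks_like_plugin(module_name, prefix=PLUGIN_PREFIX):
    """Hypothetical check: does the module path carry the required prefix?"""
    return module_name.startswith(prefix)


def find_candidate_plugins(prefix=PLUGIN_PREFIX):
    """List importable top-level modules whose names start with the prefix.

    This only scans names on sys.path; it does not import anything and it
    is not how pyexcel-io itself is confirmed to load plugins.
    """
    return sorted(
        info.name for info in pkgutil.iter_modules()
        if info.name.startswith(prefix)
    )


print(looks_like_plugin("pyexcel_xls"))   # module path of pyexcel-xls
print(looks_like_plugin("my_yaml_io"))    # would not satisfy the rule
print(find_candidate_plugins())           # depends on what is installed
```

Note that the distribution name uses a hyphen (`pyexcel-xls`) while the module path uses an underscore (`pyexcel_xls`); the rule applies to the module path.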
.. note:: .. note::
@ -10,7 +14,7 @@ explains its implementations to help you write yours.
You no longer need to do explicit imports for pyexcel-io extensions. You no longer need to do explicit imports for pyexcel-io extensions.
Instead, you install them and manage them via pip. Instead, you install them and manage them via pip.
Reader Simple Reader for a yaml file
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
Suppose we have a yaml file, containing a dictionary where the values are Suppose we have a yaml file, containing a dictionary where the values are
@ -60,7 +64,7 @@ files on physical disk. "memory" means a file stream. "content" means a string b
:lines: 36-41 :lines: 36-41
**Test your reader ** **Test your reader**
Let's run the following code and see if it works. Let's run the following code and see if it works.
@ -68,13 +72,21 @@ Let's run the following code and see if it works.
:language: python :language: python
:lines: 43-45 :lines: 43-45
Writer
You would see these in standard output:
.. code-block:: bash
$ python custom_yaml_reader.py
OrderedDict([('sheet 1', [[1, 2, 3], [2, 3, 4]]), ('sheet 2', [['A', 'B', 'C']])])
A writer to write content in yaml
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
Now for the writer, let's write a pyexcel-io writer that writes a dictionary of Now for the writer, let's write a pyexcel-io writer that writes a dictionary of
two-dimensional arrays back into the yaml file seen above. two-dimensional arrays back into the yaml file seen above.
** Implement IWriter ** **Implement IWriter**
Two abstract functions are required: Two abstract functions are required:
@ -85,7 +97,7 @@ Two abstract functions are required:
:language: python :language: python
:lines: 18-30 :lines: 18-30
** Implement ISheetWriter ** **Implement ISheetWriter**
You will most likely have your own sheet writer. You simply need to figure You will most likely have your own sheet writer. You simply need to figure
out how to write a row. The row-by-row write action is already implemented by `ISheetWriter`. out how to write a row. The row-by-row write action is already implemented by `ISheetWriter`.
@ -111,8 +123,25 @@ Let's run the following code and please examine `mytest.yaml` yourself.
:language: python :language: python
:lines: 40-46 :lines: 40-46
And you shall find a file named 'mytest.yaml':
.. code-block:: bash
$ cat mytest.yaml
sheet 1:
- - 1
- 3
- 4
- - 2
- 4
- 9
sheet 2:
- - B
- C
- D
Other pyexcel-io plugins Other pyexcel-io plugins
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
@ -138,26 +167,6 @@ And you can also get the data back::
[[1, 2, 3]] [[1, 2, 3]]
Work with memory file
-----------------------------------------------------------------------------
Here is the sample code to work with memory file::
>>> from pyexcel_io.manager import get_io
>>> io = get_io("xls")
>>> data = [[1,2,3]]
>>> save_data(io, data, "xls")
The difference is that you have to mention the file type if you use :meth:`pyexcel_io.save_data`
And you can also get the data back::
>>> data = get_data(io, "xls")
>>> data['pyexcel_sheet1']
[[1, 2, 3]]
The same applies to :meth:`pyexcel_io.get_data`.
Other formats Other formats
----------------------------------------------------------------------------- -----------------------------------------------------------------------------

View File

@ -159,6 +159,7 @@ get_data(.., library='pyexcel-ods')
============= ======= ======== ======= ======== ======== ======== ============= ======= ======== ======= ======== ======== ========
`pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_ `pyexcel-io`_ `xls`_ `xlsx`_ `ods`_ `ods3`_ `odsr`_ `xlsxw`_
============= ======= ======== ======= ======== ======== ======== ============= ======= ======== ======= ======== ======== ========
0.6.0+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+ 0.5.10+ 0.5.0+ 0.5.0+ 0.5.4 0.5.3 0.5.0+ 0.5.0+
0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.1+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+ 0.5.0+
0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x 0.4.x
@ -193,7 +194,6 @@ get_data(.., library='pyexcel-ods')
csvz csvz
sqlalchemy sqlalchemy
django django
options
extensions extensions
@ -210,6 +210,7 @@ API
.. autosummary:: .. autosummary::
:toctree: api/ :toctree: api/
iget_data
get_data get_data
save_data save_data

View File

@ -1,11 +0,0 @@
Options
======================
Here is the documentation on the keyword options for get_data.
keep_trailing_empty_cells
------------------------------
default: False
If turned on, the returned data will contain trailing empty cells.

View File

@ -1,5 +1,4 @@
Rendering(Formatting) the data Rendering(Formatting) the data
================================================================================ ================================================================================
You might want to do custom rendering on your data obtained. `row_renderer` was You might want to do custom rendering on your data obtained. `row_renderer` was

View File

@ -7,16 +7,9 @@
:copyright: (c) 2014-2020 by Onni Software Ltd. :copyright: (c) 2014-2020 by Onni Software Ltd.
:license: New BSD License, see LICENSE for more details :license: New BSD License, see LICENSE for more details
""" """
# flake8: noqa
# pylint: disable=import-error
# pylint: disable=invalid-name
# pylint: disable=too-few-public-methods
# pylint: disable=ungrouped-imports
# pylint: disable=redefined-variable-type
import sys
import types
import logging import logging
from collections import OrderedDict from io import BytesIO, StringIO # noqa: F401
from collections import OrderedDict # noqa: F401
try: try:
from logging import NullHandler from logging import NullHandler
@ -27,8 +20,6 @@ except ImportError:
pass pass
from io import BytesIO, StringIO
text_type = str text_type = str
irange = range irange = range
@ -48,7 +39,4 @@ def isstream(instance):
def is_string(atype): def is_string(atype):
"""find out if a type is str or not""" """find out if a type is str or not"""
if atype == str: return atype == str
return True
return False

View File

@ -25,6 +25,10 @@ from pyexcel_io.exceptions import (
def iget_data(afile, file_type=None, **keywords): def iget_data(afile, file_type=None, **keywords):
"""Get data from an excel file source """Get data from an excel file source
The data has not gone into memory yet. If you use dedicated partial read
plugins, such as pyexcel-xlsxr, pyexcel-odsr, you will notice
the memory consumption drop when you work with big files.
:param afile: a file name, a file stream or actual content :param afile: a file name, a file stream or actual content
:param sheet_name: the name of the sheet to be loaded :param sheet_name: the name of the sheet to be loaded
:param sheet_index: the index of the sheet to be loaded :param sheet_index: the index of the sheet to be loaded
@ -32,9 +36,6 @@ def iget_data(afile, file_type=None, **keywords):
:param file_type: used only when filename is not a physical file name :param file_type: used only when filename is not a physical file name
:param force_file_type: used only when filename refers to a physical file :param force_file_type: used only when filename refers to a physical file
and it is intended to open it as forced file type. and it is intended to open it as forced file type.
:param streaming: toggles the type of returned data. The values of the
returned dictionary remain as generator if it is set
to True. Default is False.
:param library: explicitly name a library for use. :param library: explicitly name a library for use.
e.g. library='pyexcel-ods' e.g. library='pyexcel-ods'
:param auto_detect_float: defaults to True :param auto_detect_float: defaults to True
@ -44,6 +45,7 @@ def iget_data(afile, file_type=None, **keywords):
:param ignore_nan_text: various forms of 'NaN', 'nan' are ignored :param ignore_nan_text: various forms of 'NaN', 'nan' are ignored
:param default_float_nan: choose one form of 'NaN', 'nan' :param default_float_nan: choose one form of 'NaN', 'nan'
:param pep_0515_off: turn off pep 0515. default to True. :param pep_0515_off: turn off pep 0515. default to True.
:param keep_trailing_empty_cells: keep trailing columns. default to False
:param keywords: any other library specific parameters :param keywords: any other library specific parameters
:returns: an ordered dictionary :returns: an ordered dictionary
""" """
@ -59,7 +61,10 @@ def get_data(afile, file_type=None, streaming=None, **keywords):
:param afile: a file name, a file stream or actual content :param afile: a file name, a file stream or actual content
:param sheet_name: the name of the sheet to be loaded :param sheet_name: the name of the sheet to be loaded
:param sheet_index: the index of the sheet to be loaded :param sheet_index: the index of the sheet to be loaded
:param sheets: a list of sheet to be loaded
:param file_type: used only when filename is not a physical file name :param file_type: used only when filename is not a physical file name
:param force_file_type: used only when filename refers to a physical file
and it is intended to open it as forced file type.
:param streaming: toggles the type of returned data. The values of the :param streaming: toggles the type of returned data. The values of the
returned dictionary remain as generator if it is set returned dictionary remain as generator if it is set
to True. Default is False. to True. Default is False.
@ -69,6 +74,10 @@ def get_data(afile, file_type=None, streaming=None, **keywords):
:param auto_detect_int: defaults to True :param auto_detect_int: defaults to True
:param auto_detect_datetime: defaults to True :param auto_detect_datetime: defaults to True
:param ignore_infinity: defaults to True :param ignore_infinity: defaults to True
:param ignore_nan_text: various forms of 'NaN', 'nan' are ignored
:param default_float_nan: choose one form of 'NaN', 'nan'
:param pep_0515_off: turn off pep 0515. default to True.
:param keep_trailing_empty_cells: keep trailing columns. default to False
:param keywords: any other library specific parameters :param keywords: any other library specific parameters
:returns: an ordered dictionary :returns: an ordered dictionary
""" """
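The docstrings above contrast data that "has not gone into memory yet" with fully loaded data. The following standard-library sketch illustrates that difference with a generator over CSV rows. It does not call pyexcel-io; `iter_rows` is a hypothetical stand-in for a lazily read sheet.

```python
import csv
import io


def iter_rows(stream):
    """Hypothetical lazy reader: yield one parsed row at a time."""
    for row in csv.reader(stream):
        yield row


content = "1,2,3\n4,5,6\n"

# Lazy: nothing is parsed until a row is requested
lazy = iter_rows(io.StringIO(content))
first = next(lazy)

# Eager: every row is parsed and held in a list at once
eager = list(iter_rows(io.StringIO(content)))

print(first)
print(eager)
```

With a large file the lazy form holds one row at a time, which is the kind of memory saving the `iget_data` docstring describes for partial-read plugins.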

View File

@ -1,5 +1,4 @@
import os import os
import sys
import types import types
from zipfile import BadZipfile from zipfile import BadZipfile
from unittest import TestCase from unittest import TestCase
@ -12,8 +11,6 @@ from pyexcel_io._compact import BytesIO, StringIO, OrderedDict, is_string
from nose.tools import eq_, raises from nose.tools import eq_, raises
PY2 = sys.version_info[0] == 2
@raises(IOError) @raises(IOError)
def test_directory_name_as_file(): def test_directory_name_as_file():
@ -116,11 +113,8 @@ def test_load_unknown_data_from_memory():
@raises(BadZipfile) @raises(BadZipfile)
def test_load_csvz_data_from_memory(): def test_load_csvz_data_from_memory():
if not PY2: io = StringIO()
io = StringIO() get_data(io, file_type="csvz")
get_data(io, file_type="csvz")
else:
raise BadZipfile("pass it")
@raises(IOError) @raises(IOError)
@ -130,12 +124,9 @@ def test_write_xlsx_data():
@raises(Exception) @raises(Exception)
def test_writer_csvz_data_from_memory(): def test_writer_csvz_data_from_memory():
if not PY2: io = StringIO()
io = StringIO() writer = get_writer(io, file_type="csvz")
writer = get_writer(io, file_type="csvz") writer.write({"adb": [[2, 3]]})
writer.write({"adb": [[2, 3]]})
else:
raise Exception("pass it")
@raises(exceptions.NoSupportingPluginFound) @raises(exceptions.NoSupportingPluginFound)
@ -264,10 +255,7 @@ def test_conversion_from_bytes_to_text():
def test_is_string(): def test_is_string():
if PY2: assert is_string(type("a")) is True
assert is_string(type(u"a")) is True
else:
assert is_string(type("a")) is True
def test_generator_is_obtained(): def test_generator_is_obtained():

View File

@ -1,6 +1,5 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
import os import os
import sys
import zipfile import zipfile
from unittest import TestCase from unittest import TestCase
@ -12,8 +11,6 @@ from pyexcel_io._compact import OrderedDict
from nose.tools import raises from nose.tools import raises
PY2 = sys.version_info[0] == 2
class TestCSVZ(TestCase): class TestCSVZ(TestCase):
file_type = "csvz" file_type = "csvz"

View File

@ -1,10 +1,17 @@
from datetime import time, datetime, timedelta
from pyexcel_io.service import ( from pyexcel_io.service import (
date_value, date_value,
time_value, time_value,
boolean_value,
ods_bool_value,
ods_date_value,
ods_time_value,
ods_float_value, ods_float_value,
throw_exception, throw_exception,
detect_int_value, detect_int_value,
detect_float_value, detect_float_value,
ods_timedelta_value,
) )
from pyexcel_io.exceptions import IntegerAccuracyLossError from pyexcel_io.exceptions import IntegerAccuracyLossError
@ -106,3 +113,52 @@ def test_big_int_value():
@raises(IntegerAccuracyLossError) @raises(IntegerAccuracyLossError)
def test_throw_exception(): def test_throw_exception():
throw_exception(1000000000000000) throw_exception(1000000000000000)
def test_boolean_value():
fixture = ["true", "false", 1]
expected = [True, False, 1]
actual = [boolean_value(element) for element in fixture]
eq_(actual, expected)
def test_time_delta_presentation():
a = datetime(2020, 12, 12, 12, 12, 12)
b = datetime(2020, 11, 12, 12, 12, 11)
delta = a - b
value = ods_timedelta_value(delta)
eq_(value, "PT720H00M01S")
def test_ods_bool_to_string():
fixture = [True, False]
expected = ["true", "false"]
actual = [ods_bool_value(element) for element in fixture]
eq_(actual, expected)
def test_ods_time_value():
test = datetime(2020, 10, 6, 11, 11, 11)
actual = ods_time_value(test)
eq_(actual, "PT11H11M11S")
def test_ods_date_value():
test = datetime(2020, 10, 6, 11, 11, 11)
actual = ods_date_value(test)
eq_(actual, "2020-10-06")
def test_time_value_returns_time_delta():
test_time_value = "PT720H00M01S"
delta = time_value(test_time_value)
eq_(delta, timedelta(days=30, seconds=1))
def test_time_value():
test_time_value = "PT23H00M01S"
delta = time_value(test_time_value)
eq_(delta, time(23, 0, 1))
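The new tests pin down a round trip between Python time objects and ODS-style duration strings such as "PT720H00M01S". The sketch below reproduces the behaviour the tests assert, under the assumption that durations always have the form PT<hours>H<minutes>M<seconds>S with whole, non-negative numbers. `parse_duration` and `format_timedelta` are hypothetical names, not the functions in `pyexcel_io.service`.

```python
import re
from datetime import time, timedelta

# Assumed format: whole hours, minutes and seconds only
DURATION = re.compile(r"^PT(\d+)H(\d+)M(\d+)S$")


def parse_duration(text):
    """Return a time for values within one day, else a timedelta."""
    match = DURATION.match(text)
    if match is None:
        raise ValueError("unsupported duration: %r" % text)
    hours, minutes, seconds = (int(group) for group in match.groups())
    if hours > 23:
        # datetime.time cannot hold more than 23 hours
        return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return time(hours, minutes, seconds)


def format_timedelta(delta):
    """Format a non-negative timedelta as PT<hours>H<minutes>M<seconds>S."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "PT%02dH%02dM%02dS" % (hours, minutes, seconds)


print(format_timedelta(timedelta(days=30, seconds=1)))
print(parse_duration("PT720H00M01S"))
print(parse_duration("PT23H00M01S"))
```

The hours field is not capped at 24, which is why 30 days and one second is written as 720 hours rather than as days.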