🤝 merge with dev

2018-08-22 18:52:46 +01:00 · 2018-08-22 18:52:46 +01:00 · 42c4fac89b
parent 7aec2d0407 c88163a67d
commit 42c4fac89b
14 changed files with 268 additions and 14 deletions
--- a/.moban.d/README.rst
+++ b/.moban.d/README.rst
@@ -5,6 +5,11 @@

 {%block description%}
 **pyexcel-{{file_type}}** is a tiny wrapper library to read, manipulate and write data in {{file_type}} format and it can read xlsx and xlsm format. You are likely to use it with `pyexcel <https://github.com/pyexcel/pyexcel>`_.
+
+New flag: `detect_merged_cells` spreads the value of a merged region across all of its cells. Be aware that this may slow down reading.
+
+New flag: `skip_hidden_row_and_column` skips hidden rows and columns and defaults to **True**. It may slow down reading, and it is only valid for 'xls' files. For 'xlsx' files, please use pyexcel-xlsx.
+
 {%endblock%}

 {%block extras %}
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -1,7 +1,44 @@
 Change log
 ================================================================================

-0.6.0 - unreleased
+0.5.7 - 15.03.2018
+--------------------------------------------------------------------------------
+
+Added
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. `pyexcel#54 <https://github.com/pyexcel/pyexcel/issues/54>`_, the
+   Book.datemode attribute of the workbook should always be passed.
+
+0.5.6 - 15.03.2018
+--------------------------------------------------------------------------------
+
+Added
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. `pyexcel#120 <https://github.com/pyexcel/pyexcel/issues/120>`_, xlwt cannot
+   save a book without any sheet. So, let's raise an exception in this case in
+   order to warn the developers.
+
+0.5.5 - 8.11.2017
+--------------------------------------------------------------------------------
+
+Added
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. `#25 <https://github.com/pyexcel/pyexcel-xls/issues/25>`_, detect merged cell
+   in .xls
+
+0.5.4 - 2.11.2017
+--------------------------------------------------------------------------------
+
+Added
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. `#24 <https://github.com/pyexcel/pyexcel-xls/issues/24>`_, xlsx format cannot
+   use skip_hidden_row_and_column. Please use pyexcel-xlsx instead.
+
+0.5.3 - 2.11.2017
 --------------------------------------------------------------------------------

 Added
@@ -10,6 +47,27 @@ Added
 #. `#21 <https://github.com/pyexcel/pyexcel-xls/issues/21>`_, skip hidden rows
   and columns under 'skip_hidden_row_and_column' flag.

+0.5.2 - 23.10.2017
+--------------------------------------------------------------------------------
+
+Updated
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. pyexcel `pyexcel#105 <https://github.com/pyexcel/pyexcel/issues/105>`_,
+   remove gease from setup_requires, introduced by 0.5.1.
+#. remove python2.6 test support
+#. update its dependency on pyexcel-io to 0.5.3
+
+0.5.1 - 20.10.2017
+--------------------------------------------------------------------------------
+
+Added
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#. `pyexcel#103 <https://github.com/pyexcel/pyexcel/issues/103>`_, include
+   LICENSE file in MANIFEST.in, meaning LICENSE file will appear in the released
+   tar ball.
+
 0.5.0 - 30.08.2017
 --------------------------------------------------------------------------------

--- a/README.rst
+++ b/README.rst
@@ -20,6 +20,11 @@ pyexcel-xls - Let you focus on data, instead of xls format

 **pyexcel-xls** is a tiny wrapper library to read, manipulate and write data in xls format and it can read xlsx and xlsm format. You are likely to use it with `pyexcel <https://github.com/pyexcel/pyexcel>`_.

+New flag: `detect_merged_cells` spreads the value of a merged region across all of its cells. Be aware that this may slow down reading.
+
+New flag: `skip_hidden_row_and_column` skips hidden rows and columns and defaults to **True**. It may slow down reading, and it is only valid for 'xls' files. For 'xlsx' files, please use pyexcel-xlsx.
+
+
 Known constraints
 ==================

--- a/changelog.yml
+++ b/changelog.yml
@@ -1,12 +1,52 @@
 name: pyexcel-xls
 organisation: pyexcel
 releases:
+- changes:
+  - action: Added
+    details:
+    - "`pyexcel#54`, Book.datemode attribute of that workbook should be passed always."
+  date: 15.03.2018
+  version: 0.5.7
+- changes:
+  - action: Added
+    details:
+    - "`pyexcel#120`, xlwt cannot save a book without any sheet. So, let's raise an exception in this case in order to warn the developers."
+  date: 15.03.2018
+  version: 0.5.6
+- changes:
+  - action: Added
+    details:
+    - '`#25`, detect merged  cell in .xls'
+  date: 8.11.2017
+  version: 0.5.5
+- changes:
+  - action: Added
+    details:
+    - '`#24`, xlsx format cannot  use skip_hidden_row_and_column. please use pyexcel-xlsx
+      instead.'
+  date: 2.11.2017
+  version: 0.5.4
 - changes:
  - action: Added
    details:
    - '`#21`, skip hidden rows  and columns under ''skip_hidden_row_and_column'' flag.'
-  date: unreleased
-  version: 0.6.0
+  date: 2.11.2017
+  version: 0.5.3
+- changes:
+  - action: updated
+    details:
+    - pyexcel `pyexcel#105`, remove gease  from setup_requires, introduced by 0.5.1.
+    - remove python2.6 test support
+    - update its dependency on pyexcel-io to 0.5.3
+  date: 23.10.2017
+  version: 0.5.2
+- changes:
+  - action: added
+    details:
+    - '`pyexcel#103`, include LICENSE file  in MANIFEST.in, meaning LICENSE file will
+      appear in the released tar ball.'
+  date: 20.10.2017
+  version: 0.5.1
 - changes:
  - action: Updated
    details:
--- a/pyexcel-xls.yml
+++ b/pyexcel-xls.yml
@@ -6,7 +6,7 @@ current_version: 0.5.8
 release: 0.5.7
 file_type: xls
 dependencies:
-  - pyexcel-io>=0.5.0
+  - pyexcel-io>=0.5.3
  - xlrd
  - xlwt
 description: A wrapper library to read, manipulate and write data in xls format. It reads xlsx and xlsm format
--- a/pyexcel_xls/xlsr.py
+++ b/pyexcel_xls/xlsr.py
@@ -12,7 +12,7 @@ import xlrd

 from pyexcel_io.book import BookReader
 from pyexcel_io.sheet import SheetReader
-from pyexcel_io._compact import OrderedDict
+from pyexcel_io._compact import OrderedDict, irange
 from pyexcel_io.service import has_no_digits_in_float


@@ -24,17 +24,38 @@ XLS_KEYWORDS = [
 DEFAULT_ERROR_VALUE = '#N/A'


+class MergedCell(object):
+    def __init__(self, row_low, row_high, column_low, column_high):
+        self.__rl = row_low
+        self.__rh = row_high
+        self.__cl = column_low
+        self.__ch = column_high
+        self.value = None
+
+    def register_cells(self, registry):
+        for rowx in irange(self.__rl, self.__rh):
+            for colx in irange(self.__cl, self.__ch):
+                key = "%s-%s" % (rowx, colx)
+                registry[key] = self
+
+
 class XLSheet(SheetReader):
    """
    xls, xlsx, xlsm sheet reader

    Currently only support first sheet in the file
    """
-    def __init__(self, sheet, auto_detect_int=True, **keywords):
+    def __init__(self, sheet, auto_detect_int=True, date_mode=0, **keywords):
        SheetReader.__init__(self, sheet, **keywords)
        self.__auto_detect_int = auto_detect_int
        self.__hidden_cols = []
        self.__hidden_rows = []
+        self.__merged_cells = {}
+        self._book_date_mode = date_mode
+        if keywords.get('detect_merged_cells') is True:
+            for merged_cell_ranges in sheet.merged_cells:
+                merged_cells = MergedCell(*merged_cell_ranges)
+                merged_cells.register_cells(self.__merged_cells)
        if keywords.get('skip_hidden_row_and_column') is True:
            for col_index, info in self._native_sheet.colinfo_map.items():
                if info.hidden == 1:
@@ -63,16 +84,26 @@ class XLSheet(SheetReader):
        """
        Random access to the xls cells
        """
-        row, column = self._offset_hidden_indices(row, column)
+        if self._keywords.get('skip_hidden_row_and_column') is True:
+            row, column = self._offset_hidden_indices(row, column)
        cell_type = self._native_sheet.cell_type(row, column)
        value = self._native_sheet.cell_value(row, column)
+
        if cell_type == xlrd.XL_CELL_DATE:
-            value = xldate_to_python_date(value)
+            value = xldate_to_python_date(value, self._book_date_mode)
        elif cell_type == xlrd.XL_CELL_NUMBER and self.__auto_detect_int:
            if has_no_digits_in_float(value):
                value = int(value)
        elif cell_type == xlrd.XL_CELL_ERROR:
            value = DEFAULT_ERROR_VALUE
+
+        if self.__merged_cells:
+            merged_cell = self.__merged_cells.get("%s-%s" % (row, column))
+            if merged_cell:
+                if merged_cell.value:
+                    value = merged_cell.value
+                else:
+                    merged_cell.value = value
        return value

    def _offset_hidden_indices(self, row, column):
@@ -100,6 +131,7 @@ class XLSBook(BookReader):
        self._file_content = None
        self.__skip_hidden_sheets = True
        self.__skip_hidden_row_column = True
+        self.__detect_merged_cells = False

    def open(self, file_name, **keywords):
        self.__parse_keywords(**keywords)
@@ -118,6 +150,7 @@ class XLSBook(BookReader):
        self.__skip_hidden_sheets = keywords.get('skip_hidden_sheets', True)
        self.__skip_hidden_row_column = keywords.get(
            'skip_hidden_row_and_column', True)
+        self.__detect_merged_cells = keywords.get('detect_merged_cells', False)

    def close(self):
        if self._native_book:
@@ -148,7 +181,8 @@ class XLSBook(BookReader):
        return result

    def read_sheet(self, native_sheet):
-        sheet = XLSheet(native_sheet, **self._keywords)
+        sheet = XLSheet(native_sheet, date_mode=self._native_book.datemode,
+                        **self._keywords)
        return {sheet.name: sheet.to_array()}

    def _get_book(self, on_demand=False):
@@ -164,7 +198,9 @@ class XLSBook(BookReader):
            xlrd_params['file_contents'] = self._file_content
        else:
            raise IOError("No valid file name or file content found.")
-        if self.__skip_hidden_row_column:
+        if self.__skip_hidden_row_column and self._file_type == 'xls':
+            xlrd_params['formatting_info'] = True
+        if self.__detect_merged_cells:
            xlrd_params['formatting_info'] = True
        xls_book = xlrd.open_workbook(**xlrd_params)
        return xls_book
@@ -178,11 +214,12 @@ class XLSBook(BookReader):
        return params


-def xldate_to_python_date(value):
+def xldate_to_python_date(value, date_mode):
    """
    convert xl date to python date
    """
-    date_tuple = xlrd.xldate_as_tuple(value, 0)
+    date_tuple = xlrd.xldate_as_tuple(value, date_mode)
+
    ret = None
    if date_tuple == (0, 0, 0, 0, 0, 0):
        ret = datetime.datetime(1900, 1, 1, 0, 0, 0)
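
The `date_mode` argument threaded through the hunks above matters because the same serial number maps to different calendar dates under Excel's 1900 and 1904 date systems. The following is a minimal sketch of that effect using only the standard library. `EPOCHS` and `serial_to_date` are hypothetical names for illustration and are not part of pyexcel-xls or xlrd; the sketch handles whole-day serials only, and for the 1900 system it is accurate only for serials of 61 and above.

```python
import datetime

# Assumed epochs for the two Excel date systems (illustrative only):
# mode 0 is the 1900 system, mode 1 is the 1904 system.
# The 1899-12-30 epoch is only correct for serials >= 61, because the
# 1900 system counts a 29 February 1900 that never existed.
EPOCHS = {
    0: datetime.date(1899, 12, 30),
    1: datetime.date(1904, 1, 1),
}


def serial_to_date(serial, date_mode):
    """Convert a whole-day Excel serial number to a date (hypothetical helper)."""
    return EPOCHS[date_mode] + datetime.timedelta(days=int(serial))


# The same serial lands four years and one day apart.
print(serial_to_date(41071.0, 0))  # 2012-06-11
print(serial_to_date(41071.0, 1))  # 2016-06-12
```

Hard-coding mode 0, as the removed line did, would therefore misread every date in a workbook saved with the 1904 system.
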
--- a/pyexcel_xls/xlsw.py
+++ b/pyexcel_xls/xlsw.py
@@ -18,6 +18,7 @@ from pyexcel_io.sheet import SheetWriter
 DEFAULT_DATE_FORMAT = "DD/MM/YY"
 DEFAULT_TIME_FORMAT = "HH:MM:SS"
 DEFAULT_DATETIME_FORMAT = "%s %s" % (DEFAULT_DATE_FORMAT, DEFAULT_TIME_FORMAT)
+EMPTY_SHEET_NOT_ALLOWED = "xlwt does not support a book without any sheets"


 class XLSheetWriter(SheetWriter):
@@ -76,6 +77,12 @@ class XLSWriter(BookWriter):
        self.work_book = Workbook(style_compression=style_compression,
                                  encoding=encoding)

+    def write(self, incoming_dict):
+        if incoming_dict:
+            BookWriter.write(self, incoming_dict)
+        else:
+            raise NotImplementedError(EMPTY_SHEET_NOT_ALLOWED)
+
    def create_sheet(self, name):
        return XLSheetWriter(self.work_book, None, name)

--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,3 @@
-pyexcel-io>=0.5.0
+pyexcel-io>=0.5.3
 xlrd
 xlwt
--- a/setup.py
+++ b/setup.py
@@ -42,7 +42,7 @@ CLASSIFIERS = [
 ]

 INSTALL_REQUIRES = [
-    'pyexcel-io>=0.5.0',
+    'pyexcel-io>=0.5.3',
    'xlrd',
    'xlwt',
 ]
--- a/tests/fixtures/complex-merged-cells-sheet.xls
+++ b/tests/fixtures/complex-merged-cells-sheet.xls
--- a/tests/fixtures/merged-cell-sheet.xls
+++ b/tests/fixtures/merged-cell-sheet.xls
--- a/tests/fixtures/merged-sheet-exploration.xls
+++ b/tests/fixtures/merged-sheet-exploration.xls
--- a/tests/test_bug_fixes.py
+++ b/tests/test_bug_fixes.py
@@ -7,6 +7,8 @@
 import os
 import pyexcel as pe
 from pyexcel_xls import save_data
+from pyexcel_xls.xlsr import xldate_to_python_date
+from pyexcel_xls.xlsw import XLSWriter as Writer
 from _compact import OrderedDict
 from nose.tools import eq_, raises
 from nose import SkipTest
@@ -98,5 +100,20 @@ def test_issue_151():
    eq_('#N/A', s[0,0])


+@raises(NotImplementedError)
+def test_empty_book_pyexcel_issue_120():
+    """
+    https://github.com/pyexcel/pyexcel/issues/120
+    """
+    writer = Writer()
+    writer.write({})
+
+
+def test_pyexcel_issue_54():
+    xlvalue = 41071.0
+    date = xldate_to_python_date(xlvalue, 1)
+    eq_(date, datetime.date(2016, 6, 12))
+
+
 def get_fixture(file_name):
    return os.path.join("tests", "fixtures", file_name)
--- a/tests/test_merged_cells.py
+++ b/tests/test_merged_cells.py
@@ -0,0 +1,85 @@
+import os
+from pyexcel_xls import get_data
+from pyexcel_xls.xlsr import MergedCell
+from nose.tools import eq_
+
+
+def test_merged_cells():
+    data = get_data(
+        get_fixture("merged-cell-sheet.xls"),
+        detect_merged_cells=True,
+        library="pyexcel-xls")
+    expected = [[1, 2, 3], [1, 5, 6], [1, 8, 9], [10, 11, 11]]
+    eq_(data['Sheet1'], expected)
+
+
+def test_complex_merged_cells():
+    data = get_data(
+        get_fixture("complex-merged-cells-sheet.xls"),
+        detect_merged_cells=True,
+        library="pyexcel-xls")
+    expected = [
+        [1, 1, 2, 3, 15, 16, 22, 22, 24, 24],
+        [1, 1, 4, 5, 15, 17, 22, 22, 24, 24],
+        [6, 7, 8, 9, 15, 18, 22, 22, 24, 24],
+        [10, 11, 11, 12, 19, 19, 23, 23, 24, 24],
+        [13, 11, 11, 14, 20, 20, 23, 23, 24, 24],
+        [21, 21, 21, 21, 21, 21, 23, 23, 24, 24],
+        [25, 25, 25, 25, 25, 25, 25, 25, 25, 25],
+        [25, 25, 25, 25, 25, 25, 25, 25, 25, 25]
+    ]
+    eq_(data['Sheet1'], expected)
+
+
+def test_exploration():
+    data = get_data(
+        get_fixture("merged-sheet-exploration.xls"),
+        detect_merged_cells=True,
+        library="pyexcel-xls")
+    expected_sheet1 = [
+        [1, 1, 1, 1, 1, 1],
+        [2],
+        [2],
+        [2],
+        [2],
+        [2],
+        [2],
+        [2],
+        [2],
+        [2]]
+    eq_(data['Sheet1'], expected_sheet1)
+    expected_sheet2 = [
+        [3],
+        [3],
+        [3],
+        [3, 4, 4, 4, 4, 4, 4],
+        [3],
+        [3],
+        [3]]
+    eq_(data['Sheet2'], expected_sheet2)
+    expected_sheet3 = [
+        ['', '', '', '', '', 2, 2, 2],
+        [],
+        [],
+        [],
+        ['', '', '', 5],
+        ['', '', '', 5],
+        ['', '', '', 5],
+        ['', '', '', 5],
+        ['', '', '', 5]]
+    eq_(data['Sheet3'], expected_sheet3)
+
+
+def test_merged_cell_class():
+    test_dict = {}
+    merged_cell = MergedCell(1, 4, 1, 4)
+    merged_cell.register_cells(test_dict)
+    keys = sorted(list(test_dict.keys()))
+    expected = ['1-1', '1-2', '1-3', '2-1',
+                '2-2', '2-3', '3-1', '3-2', '3-3']
+    eq_(keys, expected)
+    eq_(merged_cell, test_dict['3-1'])
+
+
+def get_fixture(file_name):
+    return os.path.join("tests", "fixtures", file_name)
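
The merged-cell handling in this commit can be sketched in isolation, without xlrd or a fixture file. The example below is a simplified illustration rather than the code from the diff: `MergedRegion` and `spread` are hypothetical names, the grid and region bounds are made up, and the bounds are assumed to be half-open `(row_low, row_high, column_low, column_high)` tuples, as `test_merged_cell_class` implies. Unlike the truthiness check in `cell_value`, this sketch tests `is None`, so a merged value of `0` is also spread.

```python
class MergedRegion(object):
    """A merged range that remembers the first value read from it."""

    def __init__(self, row_low, row_high, column_low, column_high):
        self.bounds = (row_low, row_high, column_low, column_high)
        self.value = None

    def register_cells(self, registry):
        # Map every "row-column" key inside the half-open bounds to this region.
        row_low, row_high, column_low, column_high = self.bounds
        for rowx in range(row_low, row_high):
            for colx in range(column_low, column_high):
                registry["%s-%s" % (rowx, colx)] = self


def spread(grid, regions):
    """Return a copy of grid with each region filled with its first value."""
    registry = {}
    for bounds in regions:
        MergedRegion(*bounds).register_cells(registry)
    result = []
    for rowx, row in enumerate(grid):
        new_row = []
        for colx, value in enumerate(row):
            region = registry.get("%s-%s" % (rowx, colx))
            if region is not None:
                if region.value is None:
                    # First cell visited in the region: remember its value.
                    region.value = value
                else:
                    # Later cells reuse the remembered value.
                    value = region.value
            new_row.append(value)
        result.append(new_row)
    return result


# Made-up sheet: column 0 is merged over rows 0-2, row 3 is merged over columns 1-2.
grid = [[1, 2, 3], ['', 5, 6], ['', 8, 9], [10, 11, '']]
regions = [(0, 3, 0, 1), (3, 4, 1, 3)]
print(spread(grid, regions))  # [[1, 2, 3], [1, 5, 6], [1, 8, 9], [10, 11, 11]]
```

Reading cells in row-major order means the top-left cell of each region is always visited first, which is why caching the first value seen is enough.
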