bi-joe: BI engine and frontend for PostgreSQL
This commit is contained in:
parent 45538f5d2f
commit 9314d9b9c6
@@ -1,2 +1,5 @@
include VERSION
include tox.ini
recursive-include tests *.py
include create_dates.sql
recursive-include bijoe/templates *.html

README.rst | 248
@@ -2,22 +2,252 @@ BI for Publik
=============

w.c.s. OLAP
~~~~~~~~~~~

Tool to export w.c.s. data in a database with star schema for making an OLAP
cube.

::

    usage: wcs-olap [--no-feed] [-a | --url URL] [-h] [--orig ORIG] [--key KEY]
                    [--pg-dsn PG_DSN] [--schema SCHEMA]
                    [config_path]

    Export W.C.S. data as a star schema in a postgresql DB

    positional arguments:
      config_path

    optional arguments:
      --no-feed        only produce the model
      -a, --all        synchronize all wcs
      --url URL        url of the w.c.s. instance
      -h, --help       show this help message and exit
      --orig ORIG      origin of the request for signatures
      --key KEY        HMAC key for signatures
      --pg-dsn PG_DSN  Psycopg2 DB DSN
      --schema SCHEMA  schema name

Bi-Joe
~~~~~~

BI Joe is a library and a Django application to simplify querying a PostgreSQL
database containing a star schema with BI annotations, in order to easily
produce BI dashboards.

It's inspired by the Cubes project.

Features
~~~~~~~~

* use a PostgreSQL database as datastore,
* declare joins to define the star schemas of your cubes,
* declare dimensions as SQL queries or expressions defining label, group by,
  ordering or member list,
* declare measures as SQL expressions using aggregate functions.

Missing features
~~~~~~~~~~~~~~~~

* hierarchical dimensions,
* measures requiring a rollup (a percentage based on a count() of facts at an
  upper level).

Model
~~~~~

You declare your model using JSON files; those JSON files are targeted by a
list of glob patterns in the Django setting BIJOE_SCHEMAS.

.. code:: python

   BIJOE_SCHEMAS = ['/var/lib/bijoe/cubes/*.model']
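For illustration, here is a minimal sketch of how such a list of glob patterns can be expanded into model file paths; the helper name and the expansion logic are assumptions, not bijoe's actual loader:

```python
import glob


def expand_schema_globs(patterns):
    # Expand each glob pattern and return a flat, sorted list of paths;
    # this mirrors how a setting like BIJOE_SCHEMAS is typically consumed.
    paths = []
    for pattern in patterns:
        paths.extend(sorted(glob.glob(pattern)))
    return paths
```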

The JSON model files must conform to this schema:

* name: technical identifier of the model, a short string without spaces is
  preferred,

* label: any string describing the model,

* pg_dsn: string describing the connection to PostgreSQL, as expected by
  psycopg2, ex.: `"dbname=olap_db user=admin password=xxx"`,

* search_path: search path to set if relations are not all in the public
  schema,

* cubes: the list of cube descriptors,

  * name: technical identifier of the cube, same remark as for models,

  * label: as for models,

  * fact_table: name of the table storing facts,

  * key: column of the table identifying individual facts,

  * joins: list of equality joins; joins are RIGHT OUTER JOINs by default,
    and tables are cross-joined when a drilldown involves dimensions using
    multiple joins,

    * name: SQL identifier for naming the join,

    * table: name of the relation being joined,

    * master: table and column indicating the left part of the equality
      condition for the join; you can use `mydim_id` to reference the fact
      table or `otherjoin.mydim_id` to reference another join,

    * detail: name of the column on the joined table for the equality
      condition,

    * kind: type of join, must be `inner`, `left` or `right`; default is
      right,

  * dimensions: list of dimension descriptors,

    * name: technical identifier of the dimension, it will be used to name
      the dimension in the API,

    * label: human description for the dimension, used in UIs,

    * join: list of join names, indicating that some joins must be used when
      using this dimension,

    * type: type of the dimension: numerical, time-like, geographical,
      duration, etc.,

    * value: SQL expression giving the value for the dimension; it can be
      different from the value used for filtering or grouping,

    * sql_filter: SQL expression that will be used in the SQL template
      `<sql_filter> IN (...)` when filtering along a dimension,

    * value_label: SQL expression giving the shown value,

    * group_by: SQL expression to group facts along a dimension; default is
      to use value,

    * order_by: SQL expression to order dimension values; default is to use
      value,

  * measures: list of measure descriptors,

    * name: as for models,

    * label: as for models,

    * type: type of the measure: integer or duration,

    * expression: SQL expression indicating how to compute the aggregate,
      ex.: `avg(delay)`, `count(product_type)`.
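To make the descriptors above concrete, here is a hypothetical sketch (not bijoe's actual query builder) of how a fact table, its joins, the dimension values and the measure expressions could combine into one SQL statement:

```python
def build_query(cube):
    # Dimensions supply both the selected values and the GROUP BY list;
    # measures are aggregate expressions appended to the SELECT list.
    dims = [d.get('group_by', d['value']) for d in cube['dimensions']]
    measures = [m['expression'] for m in cube['measures']]
    joins = ' '.join(
        'RIGHT OUTER JOIN %(table)s AS %(name)s ON %(master)s = %(name)s.%(detail)s' % j
        for j in cube['joins'])
    return 'SELECT %s FROM %s %s GROUP BY %s' % (
        ', '.join(dims + measures), cube['fact_table'], joins, ', '.join(dims))
```

The join rendering assumes the default right outer join kind described above; a real builder would also honour `kind`, `sql_filter` and `order_by`.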

Example
+++++++

.. code:: json

   {
     "name" : "cam",
     "label" : "cam",
     "pg_dsn" : "dbname=wcs-olap",
     "search_path" : [
       "cam",
       "public"
     ],
     "cubes" : [
       {
         "name" : "all_formdata",
         "label" : "Tous les formulaires",
         "fact_table" : "formdata",
         "key" : "id",
         "joins" : [
           {
             "name" : "formdef",
             "master" : "formdef_id",
             "detail" : "id",
             "table" : "formdef"
           },
           {
             "name" : "dates",
             "master" : "receipt_time",
             "detail" : "date",
             "table" : "dates"
           }
         ],
         "dimensions" : [
           {
             "name" : "formdef",
             "label" : "formulaire",
             "type" : "integer",
             "join" : ["formdef"],
             "value" : "formdef.id",
             "value_label" : "formdef.label"
           },
           {
             "join": [
               "receipt_time"
             ],
             "label": "date de la demande",
             "name": "receipt_time",
             "type": "date",
             "value": "receipt_time.date"
           }
         ],
         "measures" : [
           {
             "name": "count",
             "label": "nombre de demandes",
             "type": "integer",
             "expression": "count({fact_table}.id)"
           },
           {
             "name" : "avg_endpoint_delay",
             "label" : "Délai de traitement",
             "type" : "duration",
             "expression" : "avg(endpoint_delay)"
           }
         ]
       }
     ]
   }

API
~~~

Model description is handled by `bijoe.schema` and model querying by
`bijoe.engine`.

bijoe.schema.Warehouse
++++++++++++++++++++++

`Warehouse` is the main class to manipulate models; it has two methods:

* `from_json(d)`, a class method which transforms a dictionary, possibly
  obtained by parsing a JSON file, into a Warehouse object.

* `to_json()`, which transforms a Warehouse object into a JSON-compatible
  dictionary.
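The round trip can be pictured with a stripped-down stand-in class; the real Warehouse carries cubes, joins, dimensions and so on, and only the two method names come from the text above:

```python
class WarehouseSketch(object):
    # Minimal stand-in for bijoe.schema.Warehouse, illustrating only the
    # from_json/to_json round trip described above.
    def __init__(self, name, label, cubes):
        self.name = name
        self.label = label
        self.cubes = cubes

    @classmethod
    def from_json(cls, d):
        return cls(d['name'], d['label'], d.get('cubes', []))

    def to_json(self):
        return {'name': self.name, 'label': self.label, 'cubes': self.cubes}
```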

bijoe.engine.Engine
+++++++++++++++++++

`Engine(warehouse)` is the entry point for querying a model; you get an
`EngineCube` object by indexing the engine with the name of a cube.

.. code:: python

   cube = Engine(warehouse)['mycube']

You can query your cube using the `query` method.

.. code:: python

   cube.query(filters=[('year', [2013, 2014, 2015])],
              drilldown=['year', 'product'],
              measures=['count', 'avg_sale'])

It returns a sequence of rows whose elements are the values of the drilldown
dimensions, in the same order as in the query, followed by the values of the
measures, also in the same order as in the query.

The `count` measure is a special measure which is always present; its
expression is always `count({fact_table}.{key})`, where `fact_table` and
`key` are the corresponding attributes of the cube.
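For instance, with drilldown `['year', 'product']` and measures `['count', 'avg_sale']`, each row can be unpacked positionally; the rows below are made-up sample data, not real query output:

```python
# Each row lists the drilldown values first (year, product), then the
# measure values (count, avg_sale), in the same order as in the query.
rows = [
    (2013, 'book', 120, 3.5),
    (2014, 'book', 90, 4.0),
]
totals = {}
for year, product, count, avg_sale in rows:
    totals[(year, product)] = {'count': count, 'avg_sale': avg_sale}
```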

setup.py | 6

@@ -42,7 +42,7 @@ def get_version():
     return '0.0.0'


-setup(name="wcs-olap",
+setup(name="publik-bi",
       version=get_version(),
       license="AGPLv3+",
       description="Export w.c.s. data to an OLAP cube",

@@ -54,8 +54,10 @@ setup(name="wcs-olap",
       maintainer_email="bdauvergne@entrouvert.com",
       packages=find_packages(),
       include_package_data=True,
-      install_requires=['requests','psycopg2', 'isodate'],
+      install_requires=['requests', 'django', 'psycopg2', 'isodate', 'Django-Select2',
+                        'XStatic-ChartNew.js'],
       entry_points={
           'console_scripts': ['wcs-olap=wcs_olap.cmd:main'],
       },
+      scripts=['bijoe-ctl'],
       cmdclass={'sdist': eo_sdist})

@@ -0,0 +1,65 @@
# -*- coding: utf-8 -*-
from bijoe.schemas import Warehouse


def test_simple_parsing():
    Warehouse.from_json({
        'name': 'coin',
        'label': 'coin',
        'pg_dsn': 'dbname=zozo',
        'search_path': ['cam', 'public'],
        'cubes': [
            {
                'name': 'all_formdata',
                'label': 'Tous les formulaires',
                'fact_table': 'formdata',
                'key': 'id',
                'joins': [
                    {
                        'name': 'formdef',
                        'master': '{fact_table}.formdef_id',
                        'table': 'formdef',
                        'detail': 'formdef.id',
                    }
                ],
                'dimensions': [
                    {
                        'label': 'formulaire',
                        'name': 'formdef',
                        'type': 'integer',
                        'join': ['formdef'],
                        'value': 'formdef.id',
                        'value_label': 'formdef.label',
                        'order_by': 'formdef.label'
                    },
                    {
                        'name': 'receipt_time',
                        'label': 'date de soumission',
                        'join': ['receipt_time'],
                        'type': 'date',
                        'value': 'receipt_time.date'
                    }
                ],
                'measures': [
                    {
                        'type': 'integer',
                        'label': 'Nombre de demandes',
                        'expression': 'count({fact_table}.id)',
                        'name': 'count'
                    },
                    {
                        'type': 'integer',
                        'label': u'Délai de traitement',
                        'expression': 'avg((to_char(endpoint_delay, \'9999.999\') || \' days\')::interval)',
                        'name': 'avg_endpoint_delay'
                    },
                    {
                        'type': 'percent',
                        'label': 'Pourcentage',
                        'expression': 'count({fact_table}.id) * 100. / (select count({fact_table}.id) from {table_expression} where {where_conditions})',
                        'name': 'percentage'
                    }
                ]
            }
        ],
    })

@@ -0,0 +1,19 @@
# Tox (http://tox.testrun.org/) is a tool for running tests
# in multiple virtualenvs. This configuration file will run the
# test suite on all supported python versions. To use it, "pip install tox"
# and then run "tox" from this directory.

[tox]
toxworkdir = {env:TMPDIR:/tmp}/tox-{env:USER}/publik-bi/

[testenv]
usedevelop = true
setenv =
  coverage: COVERAGE=--junit-xml=junit.xml --cov=src --cov-report xml
deps =
  coverage
  pytest
  pytest-cov
  pytest-random
commands =
  py.test {env:COVERAGE:} {posargs:--random tests}

@@ -43,10 +43,13 @@ def main2():
                          'postgresql DB', add_help=False)
     parser.add_argument('config_path', nargs='?', default=None)
     group = parser.add_mutually_exclusive_group()
     parser.add_argument('--no-feed', dest='feed', help='only produce the model',
                         action='store_false', default=True)
     group.add_argument("-a", "--all", help="synchronize all wcs", action='store_true',
                        default=False)
     group.add_argument('--url', help='url of the w.c.s. instance', required=False, default=None)
     args, rest = parser.parse_known_args()
     feed = args.feed
     config = get_config(path=args.config_path)
     # list all known urls
     urls = [url for url in config.sections() if url.startswith('http://') or

@@ -90,7 +93,7 @@ def main2():
                         verify=defaults.get('verify', 'True') == 'True')
         logger.info('starting synchronizing w.c.s. at %r with PostgreSQL at %s', url, pg_dsn)
         feeder = WcsOlapFeeder(api=api, schema=schema, pg_dsn=pg_dsn, logger=logger,
-                               config=defaults)
+                               config=defaults, do_feed=feed)
         feeder.feed()
         logger.info('finished')
         defaults = {}

@@ -43,7 +43,7 @@ class Context(object):


 class WcsOlapFeeder(object):
-    def __init__(self, api, pg_dsn, schema, logger=None, config=None):
+    def __init__(self, api, pg_dsn, schema, logger=None, config=None, do_feed=True):
         self.api = api
         self.logger = logger or Whatever()
         self.schema = schema

@@ -53,6 +53,7 @@ class WcsOlapFeeder(object):
         self.formdefs = api.formdefs
         self.roles = api.roles
         self.categories = api.categories
+        self.do_feed = do_feed
         self.ctx = Context()
         self.ctx.push({
             'schema': self.schema,

@@ -72,275 +73,139 @@ class WcsOlapFeeder(object):
         self.model = {
             'label': self.config.get('cubes_label', schema),
             'name': schema,
-            'browser_options': {
-                'schema': schema,
-            },
+            'search_path': [schema, 'public'],
+            'pg_dsn': pg_dsn,
             'cubes': [],
         }
+        cube = {
+            'name': 'all_formdata',
+            'label': u'Tous les formulaires',
+            'fact_table': 'formdata',
+            'key': 'id',
             'joins': [
                 {
                     'name': 'receipt_time',
+                    'table': 'dates',
+                    'detail': 'date',
                     'master': 'receipt_time',
-                    'detail': {
-                        'table': 'dates',
-                        'column': 'date',
-                        'schema': 'public',
-                    },
-                    'method': 'detail',
-                    'alias': 'dates',
                 },
                 {
                     'name': 'channel',
                     'table': 'channel',
                     'master': 'channel_id',
-                    'detail': '{channel_table}.id',
-                    'method': 'detail',
                 },
                 {
                     'name': 'role',
-                    'detail': '{role_table}.id',
-                    'method': 'detail',
+                    'detail': 'id',
                 },
                 {
                     'name': 'formdef',
                     'table': 'formdef',
                     'master': 'formdef_id',
-                    'detail': '{form_table}.id',
-                    'method': 'detail',
+                    'detail': 'id',
                 },
                 {
                     'name': 'category',
-                    'master': '{form_table}.category_id',
-                    'detail': '{category_table}.id',
                     'table': 'category',
+                    'master': 'formdef.category_id',
+                    'detail': 'id',
+                    'kind': 'left',
                 },
                 {
                     'name': 'hour',
                     'table': 'hour',
                     'master': 'hour_id',
-                    'detail': '{hour_table}.id',
-                    'method': 'detail',
+                    'detail': 'id',
                 },
                 {
                     'name': 'generic_status',
                     'table': 'status',
                     'master': 'generic_status_id',
-                    'detail': '{generic_status_table}.id',
-                    'method': 'detail',
+                    'detail': 'id',
                 },
             ],
             'dimensions': [
                 {
-                    'label': 'date de soumission',
                     'name': 'receipt_time',
-                    'role': 'time',
-                    'levels': [
-                        {
-                            'name': 'year',
-                            'label': 'année',
-                            'role': 'year',
-                            'order_attribute': 'year',
-                            'order': 'asc',
-                        },
-                        {
-                            'name': 'quarter',
-                            'order_attribute': 'quarter',
-                            'label': 'trimestre',
-                            'role': 'quarter',
-                        },
-                        {
-                            'name': 'month',
-                            'label': 'mois',
-                            'role': 'month',
-                            'attributes': ['month', 'month_name'],
-                            'order_attribute': 'month',
-                            'label_attribute': 'month_name',
-                            'order': 'asc',
-                        },
-                        {
-                            'name': 'week',
-                            'label': 'semaine',
-                            'role': 'week',
-                        },
-                        {
-                            'name': 'day',
-                            'label': 'jour',
-                            'role': 'day',
-                            'order': 'asc',
-                        },
-                        {
-                            'name': 'dow',
-                            'label': 'jour de la semaine',
-                            'attributes': ['dow', 'dow_name'],
-                            'order_attribute': 'dow',
-                            'label_attribute': 'dow_name',
-                            'order': 'asc',
-                        },
-                    ],
-                    'hierarchies': [
-                        {
-                            'name': 'default',
-                            'label': 'par défaut',
-                            'levels': ['year', 'month', 'day']
-                        },
-                        {
-                            'name': 'quarterly',
-                            'label': 'par trimestre',
-                            'levels': ['year', 'quarter']
-                        },
-                        {
-                            'name': 'weekly',
-                            'label': 'par semaine',
-                            'levels': ['year', 'week']
-                        },
-                        {
-                            'name': 'dowly',
-                            'label': 'par jour de la semaine',
-                            'levels': ['dow']
-                        },
-                    ]
+                    'label': 'date de la demande',
+                    'join': ['receipt_time'],
+                    'type': 'date',
+                    'value': 'receipt_time.date',
                 },
                 {
-                    'label': 'canaux',
-                    'name': 'channels',
+                    'name': 'channel',
+                    'label': 'canal',
+                    'join': ['channel'],
+                    'type': 'integer',
+                    'value': 'channel.id',
+                    'value_label': 'channel.label',
                 },
                 {
-                    'label': 'catégories',
-                    'name': 'categories',
+                    'name': 'category',
+                    'label': 'catégorie',
+                    'join': ['formdef', 'category'],
+                    'type': 'integer',
+                    'value': 'category.id',
+                    'value_label': 'category.label',
                 },
                 {
                     'name': 'formdef',
                     'label': 'formulaire',
+                    'join': ['formdef'],
+                    'type': 'integer',
+                    'value': 'formdef.id',
+                    'value_label': 'formdef.label',
                 },
                 {
-                    'label': 'statuts génériques',
-                    'name': 'generic_statuses',
+                    'name': 'generic_status',
+                    'label': 'statut générique',
+                    'join': ['generic_status'],
+                    'type': 'integer',
+                    'value': 'generic_status.id',
+                    'value_label': 'generic_status.label',
                 },
                 {
+                    'name': 'hour',
+                    'label': 'heure',
-                    'name': 'hours',
-                    'levels': [
-                        {
-                            'name': 'hours',
-                            'attributes': ['hour_id', 'hour_label'],
-                            'order_attribute': 'hour_id',
-                            'label_attribute': 'hour_label',
-                        }
-                    ],
+                    'join': ['hour'],
+                    'type': 'integer',
+                    'value': 'hour.id',
+                    'filter': False,
                 }
             ],
-            'mappings': {
-                'receipt_time.year': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'year',
-                },
-                'receipt_time.month': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'month'
-                },
-                'receipt_time.month_name': {
-                    'table': 'dates',
-                    'schema': 'public',
-                    'column': 'month'
-                },
-                'receipt_time.week': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'week'
-                },
-                'receipt_time.day': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'day'
-                },
-                'receipt_time.dow': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'dow'
-                },
-                'receipt_time.dow_name': {
-                    'table': 'dates',
-                    'schema': 'public',
-                    'column': 'day',
-                },
-                'receipt_time.quarter': {
-                    'table': 'dates',
-                    'column': 'date',
-                    'schema': 'public',
-                    'extract': 'quarter'
-                },
-                'formdef': 'formdef.label',
-                'channels': 'channel.label',
-                'categories': 'category.label',
-                'generic_statuses': 'status.label',
-                'hours.hour_label': '{hour_table}.label',
-                'hours.hour_id': '{hour_table}.id',
-            },
-            'cubes': [
+            'measures': [
                 {
-                    'name': schema + '_formdata',
-                    'label': 'Toutes les demandes (%s)' % schema,
-                    'key': 'id',
-                    'fact': 'formdata',
-                    'dimensions': [
-                        'receipt_time',
-                        'hours',
-                        'channels',
-                        'categories',
-                        'formdef',
-                        'generic_statuses',
-                    ],
-                    'joins': [
-                        {
-                            'name': 'receipt_time',
-                        },
-                        {
-                            'name': 'hour',
-                        },
-                        {
-                            'name': 'channel',
-                        },
-                        {
-                            'name': 'formdef',
-                        },
-                        {
-                            'name': 'category',
-                        },
-                        {
-                            'name': 'generic_status',
-                        },
-                    ],
-                    'measures': [
-                        {
-                            'name': 'endpoint_delay',
-                            'label': 'délai de traitement',
-                            'nonadditive': 'all',
-                        },
-                    ],
-                    'aggregates': [
-                        {
-                            'name': 'record_count',
-                            'label': 'nombre de demandes',
-                            'function': 'count'
-                        },
-                        {
-                            'name': 'endpoint_delay_max',
-                            'label': 'délai de traitement maximum',
-                            'measure': 'endpoint_delay',
-                            'function': 'max',
-                        },
-                        {
-                            'name': 'endpoint_delay_avg',
-                            'label': 'délai de traitement moyen',
-                            'measure': 'endpoint_delay',
-                            'function': 'avg',
-                        },
-                    ],
+                    'name': 'count',
+                    'label': 'nombre de demandes',
+                    'type': 'integer',
+                    'expression': 'count({fact_table}.id)',
                 },
-            ],
+                {
+                    'name': 'avg_endpoint_delay',
+                    'label': 'délai de traitement moyen',
+                    'type': 'duration',
+                    'expression': 'avg(endpoint_delay)',
+                },
+                {
+                    'name': 'max_endpoint_delay',
+                    'label': 'délai de traitement maximum',
+                    'type': 'duration',
+                    'expression': 'max(endpoint_delay)',
+                },
+                {
+                    'name': 'min_endpoint_delay',
+                    'label': 'délai de traitement minimum',
+                    'type': 'duration',
+                    'expression': 'min(endpoint_delay)',
+                },
+                {
+                    'name': 'percent',
+                    'label': 'Pourcentage des demandes',
+                    'type': 'percent',
+                    "expression": 'count({fact_table}.id) * 100. '
+                                  '/ (select count({fact_table}.id) from {table_expression} '
+                                  'where {where_conditions})',
+                }
+            ]
+        }
+        # apply table names
+        self.model = self.tpl(self.model)
+        self.model['cubes'].append(cube)
+        self.base_cube = self.model['cubes'][0]

     def hash_table_name(self, table_name):

@@ -473,39 +338,32 @@ class WcsOlapFeeder(object):
             ['id', 'serial primary key'],
             ['formdef_id', 'smallint REFERENCES {form_table} (id)'],
             ['receipt_time', 'date'],
-            ['year_id', 'smallint REFERENCES {year_table} (id)'],
-            ['month_id', 'smallint REFERENCES {month_table} (id)'],
             ['hour_id', 'smallint REFERENCES {hour_table} (id)'],
-            ['day_id', 'smallint REFERENCES {day_table} (id)'],
-            ['dow_id', 'smallint REFERENCES {dow_table} (id)'],
             ['channel_id', 'smallint REFERENCES {channel_table} (id)'],
             ['backoffice', 'boolean'],
             ['generic_status_id', 'smallint REFERENCES {generic_status_table} (id)'],
-            ['endpoint_delay', 'real'],
+            ['endpoint_delay', 'interval'],
         ]
         self.comments = {
-            'formdef_id': u'dim$formulaire',
-            'receipt_time': u'time$date de réception',
-            'year_id': u'dim$année',
-            'month_id': u'dim$mois',
-            'hour_id': u'dim$heure',
-            'day_id': u'dim$jour',
-            'dow_id': u'dim$jour de la semaine',
-            'channel_id': u'dim$canal',
-            'backoffice': u'dim$soumission backoffce',
-            'generic_status_id': u'dim$statut générique',
-            'endpoint_delay': u'measure$délai de traitement',
+            'formdef_id': u'formulaire',
+            'receipt_time': u'date de réception',
+            'hour_id': u'heure',
+            'channel_id': u'canal',
+            'backoffice': u'soumission backoffce',
+            'generic_status_id': u'statut générique',
+            'endpoint_delay': u'délai de traitement',
         }
         self.create_table('{generic_formdata_table}', self.columns)
         for at, comment in self.comments.iteritems():
             self.ex('COMMENT ON COLUMN {generic_formdata_table}.%s IS %%s' % at, vars=(comment,))

     def feed(self):
-        self.do_schema()
-        self.do_base_table()
+        if self.do_feed:
+            self.do_schema()
+            self.do_base_table()
         for formdef in self.formdefs:
             try:
-                formdef_feeder = WcsFormdefFeeder(self, formdef)
+                formdef_feeder = WcsFormdefFeeder(self, formdef, do_feed=self.do_feed)
                 formdef_feeder.feed()
             except WcsApiError, e:
                 # ignore authorization errors

@@ -513,14 +371,19 @@ class WcsOlapFeeder(object):
                         and e.args[2].response.status_code == 403):
                     continue
                 self.logger.error('failed to retrieve formdef %s', formdef.slug)
+        if 'cubes_model_dirs' in self.config:
+            model_path = os.path.join(self.config['cubes_model_dirs'], '%s.model' % self.schema)
+            with open(model_path, 'w') as f:
+                json.dump(self.model, f, indent=2, sort_keys=True)


 class WcsFormdefFeeder(object):
-    def __init__(self, olap_feeder, formdef):
+    def __init__(self, olap_feeder, formdef, do_feed=True):
         self.olap_feeder = olap_feeder
         self.formdef = formdef
         self.status_mapping = {}
         self.items_mappings = {}
+        self.do_feed = do_feed
         self.fields = []

     @property

@@ -614,25 +477,16 @@ class WcsFormdefFeeder(object):
                 continue

             status = data.formdef.schema.workflow.statuses_map[data.workflow.status.id]
-            if data.endpoint_delay:
-                endpoint_delay = (data.endpoint_delay.days + float(data.endpoint_delay.seconds) /
-                                  86400.)
-            else:
-                endpoint_delay = None
             row = {
                 'formdef_id': self.formdef_sql_id,
                 'receipt_time': data.receipt_time,
-                'year_id': data.receipt_time.year,
-                'month_id': data.receipt_time.month,
-                'day_id': data.receipt_time.day,
                 'hour_id': data.receipt_time.hour,
-                'dow_id': data.receipt_time.weekday(),
                 'channel_id': self.channel_to_id[data.submission.channel.lower()],
                 'backoffice': data.submission.backoffice,
                 # FIXME "En cours"/2 is never used
                 'generic_status_id': 3 if status.endpoint else 1,
                 'status_id': self.status_mapping[data.workflow.status.id],
-                'endpoint_delay': endpoint_delay,
+                'endpoint_delay': data.endpoint_delay,
             }
             # add form fields value
             for field in self.fields:

@@ -673,78 +527,78 @@ class WcsFormdefFeeder(object):
         })

         # create cube
-        self.cube = copy.deepcopy(self.base_cube)
-        self.cube.update({
-            'name': self.schema + '_' + self.table_name,
+        cube = self.cube = copy.deepcopy(self.base_cube)
+        cube.update({
+            'name': self.table_name,
             'label': self.formdef.schema.name,
-            'fact': self.table_name,
+            'fact_table': self.table_name,
+            'key': 'id',
         })
+        cube['dimensions'] = [dimension for dimension in cube['dimensions']
+                              if dimension['name'] not in ('category', 'formdef')]

         # add dimension for status
-        self.cube['joins'].append({
+        cube['joins'].append({
             'name': 'status',
             'table': self.status_table_name,
             'master': 'status_id',
-            'detail': '%s.id' % self.status_table_name,
-            'method': 'detail',
+            'detail': 'id',
         })
-        dim_name = '%s_%s' % (self.table_name, 'status')
-        self.model['dimensions'].append({
-            'name': dim_name,
+        cube['dimensions'].append({
+            'name': 'status',
             'label': 'statut',
-            'levels': [
-                {
-                    'name': 'status',
-                    'attributes': ['status_id', 'status_label'],
-                    'order_attribute': 'status_id',
-                    'label_attribute': 'status_label',
-                },
-            ],
+            'join': ['status'],
+            'type': 'integer',
+            'value': 'status.id',
+            'value_label': 'status.label',
         })
-        self.model['mappings']['%s.status_id' % dim_name] = '%s.id' % self.status_table_name
-        self.model['mappings']['%s.status_label' % dim_name] = '%s.label' % self.status_table_name
-        self.cube['dimensions'].append(dim_name)

         # add dimension for function
         for function, name in self.formdef.schema.workflow.functions.iteritems():
             at = 'function_%s' % slugify(function)
-            dim_name = '%s_function_%s' % (self.table_name, slugify(function))
-            self.cube['joins'].append({
+            cube['joins'].append({
                 'name': at,
                 'table': 'role',
                 'master': at,
-                'detail': self.tpl('{role_table}.id'),
-                'alias': at,
+                'detail': 'id',
             })
-            self.model['dimensions'].append({
-                'name': dim_name,
+            cube['dimensions'].append({
+                'name': at,
                 'label': u'fonction %s' % name,
+                'join': [at],
+                'type': 'integer',
+                'value': '%s.id' % at,
+                'value_label': '%s.label' % at,
+                'filter': False,
             })
-            self.model['mappings'][dim_name] = '%s.label' % at
-            self.cube['dimensions'].append(dim_name)

         # add dimensions for item fields
         for field in self.fields:
             if field.type != 'item':
                 continue
             table_name = self.hash_table_name('{formdata_table}_field_%s' % field.varname)
-            self.cube['joins'].append({
+            cube['joins'].append({
                 'name': field.varname,
                 'table': table_name,
                 'master': 'field_%s' % field.varname,
-                'detail': '%s.id' % table_name,
-                'method': 'detail',
+                'detail': 'id',
             })
-            dim_name = '%s_%s' % (self.table_name, field.varname)
-            self.model['dimensions'].append({
-                'name': dim_name,
+            cube['dimensions'].append({
+                'name': field.varname,
                 'label': field.label,
+                'join': [field.varname],
+                'type': 'integer',
+                'value': '%s.id' % field.varname,
+                'value_label': '%s.label' % field.varname,
+                'filter': field.in_filters,
             })
-            self.model['mappings'][dim_name] = '%s.label' % table_name
-            self.cube['dimensions'].append(dim_name)

-        self.model['cubes'].append(self.cube)
-        try:
-            self.logger.info('feed formdef %s', self.formdef.slug)
-            self.do_statuses()
-            self.do_data_table()
-            self.do_data()
-        finally:
-            self.olap_feeder.ctx.pop()
-        if 'cubes_model_dirs' in self.config:
-            model_path = os.path.join(self.config['cubes_model_dirs'], '%s.json' % self.schema)
-            with open(model_path, 'w') as f:
-                json.dump(self.model, f, indent=2, sort_keys=True)
+        self.model['cubes'].append(cube)
+        if self.do_feed:
+            try:
+                self.logger.info('feed formdef %s', self.formdef.slug)
+                self.do_statuses()
+                self.do_data_table()
+                self.do_data()
+            finally:
+                self.olap_feeder.ctx.pop()