Feature Backlog

TinyOlap is work in progress. This is a high-level overview of the core features and overall development status.

Last Updated 9. April 2022


1. Available Core Feature

  • Cubes - Cubes define multidimensional space and store and process the actual data.

    • Values - Cubes are intended to be used for numerical data (floats), but can store any Python data or object type. Persistence is limited by the capabilities provided through pickling and unpickling of Python objects.

  • Dimensions - Dimensions define the axis of a cube. They main contain a list or hierarchy of members. Hierarchies defined the aggregation logic of cubes.

    • Members - String keys to access data. by defining a multidimensional address of all dimensions of a cube.

    • Member Alias - (1…N) Alias keys to access members. Helpful to provide access to members via multiple keys, e.g. a business keys and technical keys. Useful for data importing.

    • Subsets - (1…N) Lists of members. Useful for display or calculation purposes.

    • Attributes - (1…N) Attributes per member

  • Cells - Cells are python objects that provide easy access to cube cells. Most importantly they can be used in Python calculations and mainly behave as floats.

  • Area - Areas of data from a cube to define an orthogonal subspace. Useful for any kind of mass data manipulations (delete, copy etc.), provide basic arithmetic opertaions.

  • Rules - Rules add custom business or program logic to cubes. Rules are plain Python functions or methods and most often evaluate numeric expressions from data in cubes. Rules need to be registered to a cube and can (not must) be persisted in your TinyOlap database files. There are 3 different flavors of rules required to support real world business logic…

    1. All level rules - Get executed for cell requests on aggregated and base level cells.

    2. Base level rules - Get executed for cell requests on base level cells only.

    3. Aggregation level rules - Get executed for cell requests on aggregated cells only.

    4, Push rules - Get executed when data is entered or imported into a cube.

1. Available Advanced Feature

  • Slice - Minimal implementation of a static report layout for console output. Intended for TinyOlap development and demo purposes mainly.

  • SQLite Backend - (default) for persistence of TinyOlap databases in a file. Stores all artefacts of a TinyOlap database incl. rules and history (upcoming) in an SQLite database file.

    The single file approach is very handy to handle multiple instances or versions of a database.

    Note

    The SQLite backend is just for persistence. It is not used for any calculation purposes, therefore mainly just stores json objects.

  • Backend encryption - SQlite databases themselves do not support proper encryption. So everyone with access to an SQLite database file can open and read the contents of the database. But TinyOlap supports three levels of encryption on the data level.

    • NoEncryption - (default) data is saved as is. This is the fastest approach.

    • Obfuscation - Data is not encrypted but just obfuscated. This is secure enough for most use cases with non-highly-confidential data. But professional hackers will very likely be able to decrypt your data.

    • Encryption - Data is encrypted using Fernet symetric encryption. This can be seen as a secure encryption (although you’ll never know). Please be aware, that if you forget the password of an encrypted TinyOlap database, your work and data is lost. For further information, please visit: https://github.com/fernet/spec


  • Samples Databases - Samples to showcase how to build real world TinyOlap databases.

    • Tiny Database - A very small (tiny) database, entirely build from Python code. The database showcases TinyOlap’s basic features and usage and should be the starting point to get used to TinyOlap.

    • Huge Database - A larger database containing 1 million records generated by code. The database showcases the behaviour of TinyOlap with larger datasets. With this amount of data TinyOlap is still working fine, but is already outside the intended use.

    • Finance Database - A real world example for financial planning and reporting purposes. This database uses almost all features of Tinyolap and represents a real business case of larger enterprise with 100 legal entities, a rudimentary sales plan, HR plan for 10,000 employees, profit & loss statements and real-time currency conversion.

      Note

      The Finance Database is quite complex example, that would take days or even weeks for a consultant to build. Most of the code is related to create meaningful sample data, therefore the code to create the database and ingest data is what you should look at to not get confused.

    • Tiny42 Database - A very small (tiny) database that simulates a minimal IoT scenario with 42 machines producing some time series data and writing to 42 individual TinOlap databses in parallel (depending on your number of cores and settings). Then all the data from the 42 TinyOlap databases will be consolidated in 1 single Tinylop database.

    • Planespotter Database - Collects real-time flight data from https://opensky-network.org and builds a TinyOlap database from the returned data represention a distance-grid around the center (currently from Berlin ±250km, can be changed). A weird idea, isn’t it ;-)

    • Tutor Database - A small to medium sized OLAP data model for sales data. Based on a historic set of CSV and custom files from 1994. The dataset contains a 6-dimensional data model containg 134,433 records and was shipped as the sample database with MIS Alea at that time (and came on a single 3½-inch floppy disk).

    • Tutor Web Demo - A super rudimentary web frontend on top of the Tutor database. The idea was to provide something visual that help you to understand what an OLAP database actually looks like. The one and ony FastAPI <https://fastapi.tiangolo.com> was used to spin up a web-service.

2. Under Development

  • Web API - A web API server, utilizing FastAPI <https://fastapi.tiangolo.com>, to serve TinyOlap databases to web frontends and other clients.


3. Backlog


  • Cubes - Additional Cube features.

    • Cube/Cell Comments - A minimalistic discussion thread over cubes and cells, enabling users to discuss and exchange information. Maybe with attachments.

    • Splashing - The capability to enter values on aggregated cells to automatically process the break down to the bases level cells of a cube.

      • Distribution - Enter value on oe modify aggregated measure, evenly distribute values.

      • Copy - Copy from one member or multi-member-context to another.

      • Delete - Delete values and data areas.

      • Fill - Fill all cells with the same value.

      • Command Rules - Command rules are custom rules that get executed when a predefined keyword is entered by a user for a given cell content. Such rules need to specify the optional ‘command’ argument in the rules decorator.

        If entered in cells, commands must start with special character, e.g. ‘#’.

        @rule(cube:"sales", pattern:"Profit", command:"Double")
        def rule_profit(c: tinyolap.cell.Cell):
            # 'profit' is defined as 'sales' - 'cost'
            c["Sales"] *= 2
            c["Cost"] *= 2
        
        # Command to 'double the profit', only available on cells addressing the member 'Profit'.
        c.Execute("Double")         # explicit call
        c["Profit"] = "#Double"     # implicit call by setting a value
        
      • Build-In Command Rules - There should be also a list of build-in commands to execute generic action or data processing tasks to data. e.g.:

        • ‘Bookmark’ Command - Creates a named or unnamed bookmark for time travel.

        # Set a global bookmark for time travel. Both calls are identical.
        c.Execute("Bookmark", "Planning 2023", "Start of planning session 2023")
        database.trimetravel.add_bookmark("Planning 2023", "Start of planning session 2023")
        
      • Forecast - Extrapolates a series of values.

        # forecasts a single value based on 'actual' data for a given cell context
        # based on series derived from the subsequent members of the dimensions 'years' and 'months'
        c.forecast(["years", "months"], "data_type:actual")
        
    • History - Ability to travel back and forth over changes made to the data base in regards of structure and data. Available for in-memory and persistent databases, for in-memory databases only available throughout the current session (no persistence).

    • Custom aggregations - Aside of aggregations along the member hierarchies, this will enable the following aggregations individually and in combination:

      • Subset Aggregations - Aggregations based member subsets.

        # aggregate all member of the subset 'new cars' of dimension 'cars'
        total_of_new_cars = c["cars:new cars"]     # specific
        total_of_new_cars = c["new cars"]          # will also work, if no name conflicts occur
        
      • Attribute Aggregations - Aggregations based on member attribute values.

        # aggregate all member of dimension 'cars' that have attribute 'color' == 'blue'
        total_of_blue_cars = c["cars:color:blue"]   # specific
        total_of_blue_cars = c["color:blue"]        # will work, if no conflicts occur
        total_of_blue_cars = c["blue"]              # will work, if no conflicts occur
        
      • Multi-Member Aggregations - Aggregations based on a list of members.

        # aggregate the members 'sports' and 'sedan' of dimension 'cars'
        sports_and_sedan_cars = c["cars:sports, sedan"]  # specific
        sports_and_sedan_cars = c["sports, sedan"]       # will work, if no conflicts occur
        
        # aggregate all member of dimension 'cars' that have attribute 'color' == 'blue' or 'red'
        red_n_blue_cars = c["cars:color:blue, red"]  # specific
        red_n_blue_cars = c["color:blue, red"]       # will work, if no conflicts occur
        red_n_blue_cars = c["blue, red"]             # will work, if no conflicts occur
        
      • Wildcard Aggregations - Aggregations based on wildcard search (not regular expressions).

        # aggregate all member of the 'cars' dimension starting with 's'.
        total_of_new_cars = c["cars:s*"]  # specific, would return (sedan, sports)
        total_of_new_cars = c["s*"]       # will probably NOT work due to ambiguities over multiple dimensions
        
    • Cell protection - The ability to fix and protect cells from being changed, e.g. when splashing or deleting values.


  • Security and Authorization - To enable encryption and multi-user management.

    • see e.g.: https://github.com/fastapi-users/fastapi-users

    • User Management - The main idea is to know who has done what and when to enable collaboration and process data in the context or related to a user.

      There should be a user and user group concept. Rights are assigned to groups, users are assigned to groups. We need to further think about this…

    • Encryption - Secure encryption requires a single encryption key and therefore enycrypted database can only opened or started with the ‘admin’ account.

      Attention

      As SQLite does not support encryption and authorization out of the box we need to encrypt the content of the database by ourselves. As most data ist store in json, this is not a big thing to do.

    • Default Behavior - When a new database is created (by code), the default user is always ‘admin’ and no explicit login is required. For existing databases, without encryption and authorization enabled, also no explicit login should be required.

    • Authorization - Users should be restricted to read (see), write or modify certain cubes and members in dimensions. Authorization should be managed by dedicated cubes (like in MIS Alea).


  • Data Importers - Capabilities to easily import data from files and other source like

    Pandas data fames.

    • Auto Importer - Generate a database or cube from a file, incl. setup of dimensions and data import.

    • Pandas Importer - Generate a database or cube from one or more Pandas data frames.


  • Console GUI - A simple console gui for interaction with databases and cubes.


  • CI/CD - Automated CI/CD pipeline to publish to tinyolap.com.


  • Promotion - To inform others about TiynOlap.

    • One-Pager - A short document explaining the main features of TinyOlap.

    • Slide Deck - An introduction to TinyOlap.

    • Cheat Sheet - A cheat sheet for TinyOlap developers.

    • Blog post - Introduction to TinyOlap post, for medium etc.

    • Video - Introduction video to TinyOlap.

    • Homepage - A nice homepage.


4. Future Ideas

  • Port to JavaScript - It should be possible to port TinyOlap to Javascript to run a database TinyOlap database directly from within the browser as a client side application. Performance should be comparable to the current Python implementation.

    As an alternative TinyOlap could be wrapped in WASM to run in a browser.

  • Public Data Model Repository - A community driven directory of data models for various purposes. Either to provide data models to others or data. Either as code and files (preferrable) or as prebuild TinyOlap databases, with data or without.

  • Excel Client - A client to view and edit (enter data) TinyOlap databases.