Sunday, December 16, 2012

Python code to simplify reading SciDB data

I've written some Python code that uses the SciDB Python connector to access data in a more straightforward manner.  In summary, you submit a query and get back an iterator over the data.  It is currently very incomplete, and needs:

  1. To return the values of the dimensions as well as the attributes (it originally returned only the attributes).   Done!
  2. The ability to reset the iterator; it can currently only be used once.

The code is publicly available from this GitHub repository:
https://github.com/dllahr/scidb_python_utils
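
One way the second item in the list above (resetting the iterator) could be approached is to have the reader remember the query and run it again each time iteration starts. The following is only a minimal sketch of that idea, not code from the repository: `ResettableReader` and `fake_execute` are hypothetical names, and `fake_execute` stands in for the real SciDB calls so that the example runs without a database.

```python
class ResettableReader(object):
    """Hypothetical reader that can be iterated more than once.

    `execute` is an assumed callable that takes a query string and returns
    a fresh iterable of rows; it stands in for the real SciDB calls.
    """

    def __init__(self, execute):
        self._execute = execute
        self._query = None

    def read(self, query):
        # Remember the query rather than a half-consumed set of results.
        self._query = query

    def __iter__(self):
        if self._query is None:
            raise ValueError("call read() before iterating")
        # Re-run the query whenever iteration starts, so every loop
        # gets its own fresh pass over the results.
        return iter(self._execute(self._query))


def fake_execute(query):
    # Stand-in for the database: always returns the same three rows.
    return [[0, 1.5], [1, 2.5], [2, 3.5]]


reader = ResettableReader(fake_execute)
reader.read("scan(test_scidb_read)")
first_pass = list(reader)
second_pass = list(reader)  # works again because the query is re-run
```

The trade-off in this sketch is that every new loop costs another round trip to the database; caching the rows in memory would avoid that at the price of holding the whole result at once.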

Details / Background

The example code provided for using the SciDB connector required many lines of code in order to execute a query and read back the results.  I wanted a simpler system that behaves like other database readers, in which you execute a query and can then iterate over the results.  To do this, I took the SciDB example code provided in
/opt/scidb/12.10/share/scidb/examples/python/sample.py

and wrapped it in an iterator class in Python.  Here is an example of how to use it, taken from the test case provided in the repository:

import unittest

# scidbapi ships with SciDB and must be importable (on the Python path)
import scidbapi
import scidb_read.scidb_reader


class TestScidbReader(unittest.TestCase):
    array_name = "test_scidb_read"

    def test_read(self):
        # connect to the SciDB database
        scidb = scidbapi.connect("localhost", 1239)

        # instantiate the reader by passing a reference to the database connection
        reader = scidb_read.scidb_reader.ScidbReader(scidb)

        # run a specific query - in this case, read all the data in a previously set up array
        reader.read("scan({})".format(TestScidbReader.array_name))

        # iterate over the data returned by the query, printing it out
        # (each item is returned as a list)
        for data in reader:
            print(data)

        # end the query
        reader.complete_query()

        # close the database connection
        scidb.disconnect()
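
The wrapping idea described above can be illustrated without a SciDB installation. The sketch below is not the repository's implementation; it only shows the general pattern of hiding a chunk-at-a-time result behind Python's iterator protocol. `ChunkedResult`, `has_more`, `next_chunk` and `RowIterator` are hypothetical names invented for this example and are not part of the SciDB API.

```python
class ChunkedResult(object):
    """Hypothetical low-level result: data arrives one chunk at a time."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self._position = 0

    def has_more(self):
        return self._position < len(self._chunks)

    def next_chunk(self):
        chunk = self._chunks[self._position]
        self._position += 1
        return chunk


class RowIterator(object):
    """Wraps a chunked result so callers can simply write `for row in ...`."""

    def __init__(self, result):
        self._result = result
        self._buffer = []

    def __iter__(self):
        return self

    def __next__(self):
        # Refill the buffer from the next non-empty chunk when it runs dry.
        while not self._buffer:
            if not self._result.has_more():
                raise StopIteration
            self._buffer = list(self._result.next_chunk())
        return self._buffer.pop(0)

    next = __next__  # Python 2 spelling of the iterator protocol


# Three chunks, one of them empty, flattened into a single stream of rows.
result = ChunkedResult([[[0, 1.5], [1, 2.5]], [], [[2, 3.5]]])
iterator = RowIterator(result)
rows = list(iterator)
leftover = list(iterator)  # the iterator is exhausted after one pass
```

Because the wrapper holds a single position in the underlying result, a second pass yields nothing, which is the same one-use limitation noted in the list at the top of this post.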
