System = Ubuntu 9.10
Starting Reference = http://giantflyingsaucer.com/blog/?p=839
I wanted to document the process of setting up Python and MongoDB with the Eclipse programming environment. It will also serve as a reference for my Python project to organize my ebooks and documentation.
I normally use MySQL for any project that requires a database, and I have been using MySQL along with other commercial databases for some years now. But after reading in Linux Journal that MongoDB is a NoSQL database, my interest was piqued. Since relational databases are not, in my opinion, particularly suited to storing documents, I decided to start this personal project.
MongoDB Setup Steps
1. Install the curl application to pull the latest db version from the website.
$ sudo apt-get install curl
2. Once curl is installed on the system, create a folder on your local system to serve as the database repository.
$ mkdir -p /home/t/data/db
3. Once the directory is created, download the latest build from mongodb
$ curl -O http://downloads.mongodb.org/linux/mongodb-linux-i686-latest.tgz
4. Then untar the download from the website
$ tar xzf mongodb-linux-i686-latest.tgz
5. The files will now reside in the /data/db path. To start using MongoDB, simply start the database engine with
$ /data/db/mongodb-linux-i686-2010-05-01/bin/mongod &
6. Open another command terminal and execute the following command to get the mongo db interface
$ /data/db/mongodb-linux-i686-2010-05-01/bin/mongo
You will then get an interesting display of text with this version of mongo, so I guess this is the wrong build of the db, but the size is good for now
Sat May 1 13:02:23 Mongo DB : starting : pid = 2597 port = 27017 dbpath = /data/db/ master = 0 slave = 0 32-bit
****
WARNING: This is development version of MongoDB. Not recommended for production.
****
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
** see http://blog.mongodb.org/post/137788967/32-bit-limitations for more
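Before opening the shell, it can be handy to confirm that something is actually listening on the port shown in the startup output (27017). The following is only a minimal sketch using the Python standard library; the function name is_listening is my own, and it checks that the TCP port accepts connections, not that the process behind it is really mongod.

```python
import socket


def is_listening(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timeout, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 27017 is the port reported in the mongod startup output above
print(is_listening())
```

If this prints False, the mongod process from step 5 is probably not running, or it is listening on a different port.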
A command prompt of > will be given. From a security perspective, unlike MySQL, no password is required, and I am unsure what permissions the default user is granted. I will investigate this further after I get through the initial process.
The help feature is a strong point of MongoDB. It's great to get commands from the system without having to look them up.
7. > help
The help command returned the following results
show dbs                   show database names
show collections           show collections in current database
show users                 show users in current database
show profile               show most recent system.profile entries with time >= 1ms
db.help()                  help on DB methods
db.foo.help()              help on collection methods
db.foo.find()              list objects in collection foo
db.foo.find( { a : 1 } )   list objects in foo where a == 1
I used the example from the web reference above to create a test db entry and return the results
db.mystorage.save( { "message":"Hello World"} )
db.mystorage.findOne()
{ "_id" : ObjectId("4bdc88f8739846d101246a42"), "message" : "Hello World" }
I also did a little more testing and added two more entries to the mystorage collection:
db.mystorage.save( { "message1":"Hello World"} )
db.mystorage.save( { "message2":"Hello World"} )
At this point I used the previously mentioned help function and entered the following command to see my entries within the mystorage collection, which MongoDB apparently creates automatically on the first save:
db.mystorage.find()
{ "_id" : ObjectId("4bdc88f8739846d101246a42"), "message" : "Hello World" }
{ "_id" : ObjectId("4bdc8a1c739846d101246a43"), "message1" : "Hello World" }
{ "_id" : ObjectId("4bdc8a21739846d101246a44"), "message2" : "Hello World" }
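Note that the three entries above do not share the same keys, which is the schemaless behaviour that drew me to MongoDB. To make the idea behind find() and find( { a : 1 } ) concrete, here is a rough plain-Python stand-in. It is not how MongoDB is implemented; the names matches and find are hypothetical helpers, and only simple equality matching is sketched.

```python
def matches(doc, query):
    """Return True if every key/value pair in query is present in doc."""
    return all(key in doc and doc[key] == value for key, value in query.items())


def find(collection, query=None):
    """Return the documents that match query; an empty query matches all."""
    query = query or {}
    return [doc for doc in collection if matches(doc, query)]


# A list of dicts standing in for the mystorage collection
mystorage = [
    {"message": "Hello World"},
    {"message1": "Hello World"},
    {"message2": "Hello World"},
]

print(find(mystorage))                               # all three documents
print(find(mystorage, {"message1": "Hello World"}))  # only the second one
```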
------------------------------------------------------------------------------------------------
This was part one of my process. Now that I know the db portion is working on my home development system, it is time to install Python and start programming: accessing this database and writing an interface for storing and retrieving documents.
I have used C, C++, PHP, and Perl as my primary languages of choice, and Perl is the one I have refreshed most recently. But after some reading on the Internet and some language comparison, I decided to give Python a chance for a couple of reasons.
1. It is easier for a human to read compared to the cryptic nature of Perl.
2. I am not really creating a web programming structure, so PHP is out of the question.
3. I am not concerned with speed for this project, and even though C++ is probably my favorite language, it would be overkill for this simple adventure.
So I have a new language to learn, one that is said to be more secure and easy to pick up for someone with Perl and C++ programming experience.
I will try to note some differences and provide an overall assessment of the transition to a language that I have never used before.
------------------------------------------------------------------------------------------------
Installing Python on my Ubuntu system.
I will start by installing python-setuptools using the following command
sudo apt-get install python-setuptools
This provides the tools necessary to install pymongo, which supports connecting to the database from the Python programming language.
sudo easy_install pymongo
I did receive a warning message when executing this command. I will continue with the programming example from the website referenced above and hopefully resolve the warning later, but given my time constraints, I am pushing on.
.*************************************************************
WARNING: The pymongo._cbson extension module could not
be compiled. No C extensions are essential for PyMongo to run,
although they do result in significant speed improvements.
Above is the ouput showing how the compilation failed.
**************************************************************
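As the warning says, PyMongo still runs without the C extension, only more slowly. A quick way to see which mode an installation ended up in is the has_c() function that PyMongo exposes. The wrapper below is only a sketch; the function name and the returned messages are my own, and the import is guarded so the script also runs where pymongo is not installed.

```python
def pymongo_c_extension_status():
    """Report whether pymongo is installed and whether its C extensions loaded."""
    try:
        import pymongo
    except ImportError:
        return "pymongo is not installed"
    # has_c() returns True when the compiled C extensions are in use
    if pymongo.has_c():
        return "pymongo C extensions are available"
    return "pymongo is running in pure-Python mode"


print(pymongo_c_extension_status())
```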
At this point I decided to do a Google search to find the best editing environment for Python, or at least one that highlights Python syntax. I usually use Eclipse, so I should probably start there and see if it supports Python.
I did a little searching and stumbled upon the Aptana PyDev extension for Eclipse,
http://pydev.org/download.html
So I will try and get this working before I start the coding process of uploading my books.
It is very simple to install Eclipse on Ubuntu; just use
sudo apt-get install eclipse
It was fairly simple to add the extension to Eclipse. I just used the Quick Install method from pydev.org:
Quick Install: Update Manager
Go to the update manager (Help - Install New Software), add:
http://pydev.org/updates
I selected PyDev, clicked Next, and accepted the license agreement, and then I was ready to use Eclipse and Python to start the coding process.
Once you open Eclipse, go to New Project and select Pydev Project from the wizard, it will ask you to define an interpreter from among the choices IronPython, Jython and Python.
I gave it a project name MongoDB and used the default home location.
For the project type I selected Python, and the grammar version was the default, 2.6, since I have no preference for grammar versions at this point.
I clicked on Configure Interpreter and then the Auto Config button, and it configured my system libs automatically, saving me some time.
This is where the confusion started. Eclipse is not as easy to configure as one might think, and you have to read some documentation to get the environment working, so I followed this manual to get it to work:
http://pydev.org/manual_101_root.html
I followed the manual and created a project
file - new - project - Pydev - Pydev project
Project Name: TestMongoDB
Use Defaults
Project Type = Python
Grammar Version = 2.6
Interpreter = Default
Create default 'src' folder and add it to the pythonpath? Is checked
Finish
I then followed the manual to create a python package
File - new - pydev package
I changed the Source Folder to my /TestMongoDB/src and Name = testmongo.test → Finish
After this you will need to create a pydev module
- File - new - pydev module
- Ensure the source folder is correct
- Package name is the same as the one you just created
- Template =
Once my cbentries.py file was created, I used the following code to test the db, environment and code
'''
Created on May 1, 2010
@author: Tim
'''
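from pymongo.connection import Connection

# Connect to the local mongod and select the mystorage database
connection = Connection('localhost')
db = connection.mystorage

# Plain integers: literals such as 001 are read as octal in Python 2
doc1 = {"timestamp": 1, "msg": "Hello 1"}
doc2 = {"timestamp": 2, "msg": "Hello 2"}
doc3 = {"timestamp": 3, "msg": "Hello 3"}

# Save each document into the mystorage collection
db.mystorage.save(doc1)
db.mystorage.save(doc2)
db.mystorage.save(doc3)

# Print every document in the collection (Python 2 print statement)
cursor = db.mystorage.find()
for d in cursor:
    print d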
Once the code was modified for mystorage I then clicked on the Run As button and selected
Python Run → Ok
This was the final result of my test. Success! Now I can work on getting documents and a user-friendly interface. Once this has been completed, I will post the code and probably some lessons learned.
{u'msg': u'Hello 1', u'timestamp': 1, u'_id': ObjectId('4bdcbaee50f9092884000000')}
{u'msg': u'Hello 2', u'timestamp': 2, u'_id': ObjectId('4bdcbaee50f9092884000001')}
{u'msg': u'Hello 3', u'timestamp': 3, u'_id': ObjectId('4bdcbaee50f9092884000002')}
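For anyone repeating this test on a current setup: as far as I know, later PyMongo releases dropped both the Connection class and the save() method in favour of MongoClient and insert_one() / insert_many(). The sketch below shows the same test written in that style, in Python 3. The helper names build_docs and store_and_list are my own, and InMemoryCollection is a hypothetical stand-in that only mimics the two calls used, so the example runs without a server.

```python
def build_docs(count):
    """Build test documents with plain integer timestamps (1, 2, 3, ...)."""
    return [{"timestamp": n, "msg": "Hello %d" % n} for n in range(1, count + 1)]


def store_and_list(collection, docs):
    """Insert the documents, then return everything in the collection."""
    collection.insert_many(docs)
    return list(collection.find())


class InMemoryCollection(object):
    """Hypothetical stand-in for a collection so the sketch runs offline."""

    def __init__(self):
        self._docs = []

    def insert_many(self, docs):
        self._docs.extend(docs)

    def find(self):
        return iter(self._docs)


# With a live server, the collection would instead come from something like:
#   from pymongo import MongoClient
#   collection = MongoClient("localhost", 27017).mystorage.mystorage
collection = InMemoryCollection()
for doc in store_and_list(collection, build_docs(3)):
    print(doc)
```

Against a real server the printed documents would also carry the _id field that MongoDB assigns, as in the output above.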
GridFS -----------------------------
I will need to use GridFS to store the actual files in Mongo.
The database supports native storage of binary data within BSON objects. However, BSON objects in MongoDB are limited to 4MB in size.
The GridFS spec provides a mechanism for transparently dividing a large file among multiple documents. This allows us to efficiently store large objects, and in the case of especially large files, such as videos, permits range operations (e.g., fetching only the first N bytes of a file).
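To picture what dividing a file among multiple documents means, here is a small plain-Python sketch of the idea. It is not the GridFS implementation: the chunk size of 4 bytes is a deliberately tiny, hypothetical value, the function names are my own, and the chunk dictionaries are only loosely modeled on the n and data fields that GridFS chunk documents carry.

```python
import io

CHUNK_SIZE = 4  # hypothetical tiny chunk size so the example is easy to follow


def split_into_chunks(stream, chunk_size=CHUNK_SIZE):
    """Read a binary stream and return a list of numbered chunk documents."""
    chunks = []
    n = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        chunks.append({"n": n, "data": data})
        n += 1
    return chunks


def read_first_bytes(chunks, count, chunk_size=CHUNK_SIZE):
    """Return the first count bytes, touching only the chunks that are needed."""
    needed = -(-count // chunk_size)  # ceiling division
    return b"".join(chunk["data"] for chunk in chunks[:needed])[:count]


chunks = split_into_chunks(io.BytesIO(b"0123456789"))
print(len(chunks))                  # 3 chunks: b'0123', b'4567', b'89'
print(read_first_bytes(chunks, 6))  # b'012345', read from the first two chunks
```

The second function is the range operation mentioned above in miniature: only the first two chunks are read to produce six bytes.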
So now the adventure begins in trying to use GridFS to upload my large PDF files.
Reference:
http://dirolf.com/2010/03/29/new-gridfs-implementation-for-pymongo.html
Here is what I have so far using GridFS. This is working great. I will now start the process of searching through a folder, capturing the names and extensions of files, and auto-importing them into my db.
'''
Created on May 1, 2010
@author: Tim
'''
from pymongo import Connection
from gridfs import GridFS

connection = Connection('localhost')
db = connection.mystorage
fs = GridFS(db)

# Open the image in binary mode and store it in GridFS
with open("/home/t/Pictures/Photos/warty-final-ubuntu.png", "rb") as myimage:
    oid = fs.put(myimage, content_type="image/png", filename="myimage")

stored = fs.get(oid)
print stored._file   # This will give me the file entry (private attribute)
print stored._id     # This will give me the id of the entry
print stored.length  # This will give me the size of the stored file in bytes
print stored.name    # This is the name of my file
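The next step mentioned above, searching through a folder and capturing file names and extensions, could start out something like the sketch below. It uses only the Python 3 standard library; the function name scan_folder and the dictionary keys are my own choices, and the sample files are created in a temporary folder purely for demonstration.

```python
import mimetypes
import os
import tempfile


def scan_folder(root):
    """Walk root and collect path, name, extension and guessed content type."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            name, extension = os.path.splitext(filename)
            content_type, _encoding = mimetypes.guess_type(filename)
            entries.append({
                "path": os.path.join(dirpath, filename),
                "name": name,
                "extension": extension.lower(),
                "content_type": content_type,  # None when the type is unknown
            })
    return entries


# Demonstration on a throwaway folder with two empty sample files
with tempfile.TemporaryDirectory() as folder:
    for sample in ("book.pdf", "notes.txt"):
        open(os.path.join(folder, sample), "wb").close()
    for entry in scan_folder(folder):
        print(entry["name"], entry["extension"], entry["content_type"])
```

Each entry could then be handed to fs.put() with the filename and content_type keyword arguments, in the same way as the single image above.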