This app's function will be to:

* install its own tables (#todo: not yet automated)
* read the SE XML dump into the Django DB (automated)
* populate the askbot database (automated)
* remove the SE tables (#todo: not done yet)

Current process to load SE data into Askbot is:
===============================================

1) back up the database

2) unzip the SE dump into dump_dir (any directory name)
you may want to make sure that your dump directory is listed in the .gitignore file
so that you don't publish it by mistake
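
The .gitignore precaution in step 2 can also be checked with a small script. This is only a minimal sketch: the helper name ensure_ignored is hypothetical, dump_dir stands for whatever directory you chose, and the check looks for an exact entry rather than evaluating git's full pattern rules.

```python
from pathlib import Path


def ensure_ignored(repo_root, dump_dir):
    """Append dump_dir to .gitignore unless an identical entry exists.

    Returns True if an entry was added, False if it was already listed.
    Only exact entries are compared; glob patterns are not evaluated.
    """
    gitignore = Path(repo_root) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    # Compare entries without their trailing slash so "dump_dir" == "dump_dir/".
    listed = {line.strip().rstrip("/") for line in text.splitlines()}
    name = dump_dir.rstrip("/")
    if name in listed:
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with gitignore.open("a") as handle:
        handle.write(prefix + name + "/\n")
    return True
```

Calling ensure_ignored(".", "dump_dir") from the project root would add the entry only when it is missing, so it is safe to run more than once.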

3) enable 'stackexchange' in the list of installed apps (probably already in settings.py)

4) (optional - create models.py for SE, which is included anyway) run:

#a) remove the "xs:" XML namespace prefix in place to make parsing easier
perl -pi -w -e 's/xs://g' $SE_DUMP_PATH/xsd/*.xsd
cd stackexchange
python parse_models.py $SE_DUMP_PATH/xsd/*.xsd > models.py
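
If perl is not available, the prefix removal in step 4 can be approximated in Python. The sketch below is not part of the app: the function names are hypothetical, and it works on raw bytes so that the files' encoding and line endings are left untouched, much like the perl one-liner.

```python
import glob
import os


def strip_xs_prefix(data):
    """Remove every literal b'xs:' occurrence, like perl's s/xs://g."""
    return data.replace(b"xs:", b"")


def strip_prefix_in_place(xsd_dir):
    """Rewrite each *.xsd file in xsd_dir without the 'xs:' prefix.

    Returns the list of files that were actually modified.
    """
    changed = []
    for path in sorted(glob.glob(os.path.join(xsd_dir, "*.xsd"))):
        with open(path, "rb") as handle:
            original = handle.read()
        cleaned = strip_xs_prefix(original)
        if cleaned != original:
            with open(path, "wb") as handle:
                handle.write(cleaned)
            changed.append(path)
    return changed
```

Like the perl command, this edits the files in place, so run it on a copy of the dump if you want to keep the original schema files.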

5) Install stackexchange models (as well as any other missing models)
python manage.py syncdb

6) make sure that badges are installed
if not, run (example for mysql):

mysql -u user -p dbname < sql_scripts/badges.sql

7) load SE data:

python manage.py load_stackexchange dump_dir

if anything doesn't go right - run 'python manage.py flush' and repeat
steps 6 and 7

NOTES:
============

Here is the load script that I used for testing.
It assumes that the SE dump has been unzipped inside the tmp directory.

#!/bin/sh
#delete all data
python manage.py flush
mysql -u askbot -p aksbot < sql_scripts/badges.sql
python manage.py load_stackexchange tmp
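
A plain sh script keeps going after a failed command. As a hedged alternative, the same three steps could be driven from Python so that the run stops at the first failure. The names reload_commands and run_reload are made up for this sketch; the commands and arguments themselves are the ones from the script above.

```python
import subprocess


def reload_commands(dump_dir, db_user, db_name):
    """Return (argv, stdin_path) pairs mirroring the shell script's steps."""
    return [
        (["python", "manage.py", "flush"], None),
        (["mysql", "-u", db_user, "-p", db_name], "sql_scripts/badges.sql"),
        (["python", "manage.py", "load_stackexchange", dump_dir], None),
    ]


def run_reload(commands, run=subprocess.run):
    """Run each command in order; check=True raises on the first failure."""
    for argv, stdin_path in commands:
        if stdin_path is None:
            run(argv, check=True)
        else:
            # Equivalent of the shell's "< file" redirection.
            with open(stdin_path, "rb") as stdin:
                run(argv, stdin=stdin, check=True)
```

Run from the project root, run_reload(reload_commands("tmp", "askbot", "aksbot")) would repeat the script above and raise subprocess.CalledProcessError as soon as one step exits with a non-zero status.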

Untested parts are tagged with comments starting with
#todo:

The test set did not represent all StackExchange use cases, so
the import may break with other data sets.

The job takes some time to run, especially loading
content revisions and votes; this may be optimized.

Some of the fringe cases are described in the file stackexchange/ANOMALIES