Django and Oscar

Django Admin / manage.py dumpdata Subcommand

Published 2021-02-12. Last modified 2021-03-27.
Time to read: 2 minutes.

This page is part of the django collection.

dumpdata Subcommand

The dumpdata subcommand of manage.py dumps some or all of the django-oscar webapp data from the database. It provides trasparency of the database details.

Both the dumpdata and dataload subcommands have the same documentation error regarding the --database option. The documentation states that the option “Specifies the database from which data will be dumped. Defaults to default.”. Instead, the documentation should say that the option “Specifies the named database entry in settings. Defaults to default.”

By convention, database dumps should be stored in the fixtures directory of every app. However, when we make dumps for an app distributed via pip such as django-oscar, we need to provide a different directory to store the fixtures into. I created a directory called frobshop/fixtures/ for this purpose.

Shell
(aw) $ mkdir frobshop/fixtures/

Following the Django documentation on fixtures, I also added the following to frobshop/settings/base.py. However, I found no benefit in doing so, except to document to whomever might read the settings that fixtures live there. The dumpdata and loaddata subcommands of manage.py do not seem to use this information.

frobshop/settings/base.py
FIXTURE_DIRS = [
    'fixtures',
]

By default, the dumpdata subcommand of manage.py returns results in JSON format. The following example dumps all data for the entire django-oscar instance, and uses jq to pretty-print the JSON result before saving into frobshop/fixtures/all.json.

Shell
(aw) $ ./manage.py dumpdata | \
  jq > frobshop/fixtures/all.json

Jq can also parse JSON data and extract data subsets using queries. The following command displays formatted JSON associated with the catalogue app’s data, including data for models of catalogue.productclass, catalogue.category, catalogue.productattribute, catalogue.attributeoptiongroup, and catalogue.attributeoption:

Shell
(aw) $ ./manage.py dumpdata catalogue | \
  jq > frobshop/fixtures/catalogue.json

The following saves formatted JSON for all catalogue.attributeoption records into a file called frobshop/fixtures/catalogue.attributeoption.json, which I display.

Shell
(aw) $ ./manage.py dumpdata catalogue.attributeoption | \
  jq > frobshop/fixtures/catalogue.attributeoption.json

(aw) $ cat frobshop/fixtures/catalogue.attributeoption.json
[
  {
    "model": "catalogue.attributeoption",
    "pk": 1,
    "fields": {
      "group": 2,
      "option": "Soft red"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 3,
    "fields": {
      "group": 2,
      "option": "Soft green"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 4,
    "fields": {
      "group": 2,
      "option": "Soft blue"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 5,
    "fields": {
      "group": 2,
      "option": "Soft orange"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 6,
    "fields": {
      "group": 1,
      "option": "Unscented"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 7,
    "fields": {
      "group": 1,
      "option": "Bergamot - Bergaptene Free"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 8,
    "fields": {
      "group": 1,
      "option": "Cedarwood - Atlas"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 9,
    "fields": {
      "group": 1,
      "option": "Frankincense - India"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 10,
    "fields": {
      "group": 1,
      "option": "Ginger Root"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 11,
    "fields": {
      "group": 1,
      "option": "Lavender - natural blend"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 12,
    "fields": {
      "group": 1,
      "option": "Lemongrass"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 13,
    "fields": {
      "group": 1,
      "option": "Lemon - natural blend"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 14,
    "fields": {
      "group": 1,
      "option": "Orange - natural blend"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 15,
    "fields": {
      "group": 1,
      "option": "Spearmint"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 16,
    "fields": {
      "group": 1,
      "option": "Ylang Ylang"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 17,
    "fields": {
      "group": 1,
      "option": "Tea Tree - Organic"
    }
  },
  {
    "model": "catalogue.attributeoption",
    "pk": 18,
    "fields": {
      "group": 1,
      "option": "Turmeric - Organic"
    }
  }
] 

The following saves formatted JSON for only the catalogue.attributeoption records associated with the group having id 1 into a file called frobshop/fixtures/catalogue.attributeoption.fields.group.1.json:

Shell
(aw) $ ./manage.py dumpdata catalogue.attributeoption | \
  jq '.[] | select(.fields.group == 1)' > \
  frobshop/fixtures/catalogue.attributeoption.fields.group.1.json

extract Script

It seems natural to pipe the JSON output of manage.py dumpdata into jq for parsing, formatting and processing. I wrote a bash script called extract, which does that for 4 use cases:

  1. Dump the entire database and pretty-print the JSON.
    Shell
    (aw) $ bin/extract
    This does the same as the above:
    Shell
    (aw) $ bin/extract all
  2. Only dump a specified app’s data. This dumps all of the catalogue app’s data:
    Shell
    (aw) $ bin/extract catalogue
  3. Only dump a property of the specified app’s data. This dumps the attributeoption property value in the catalogue app’s data:
    Shell
    (aw) $ bin/extract catalogue.attributeoption
  4. Only dump a specified app’s data where a jq selector matches a value. This dumps the attributeoption property values in the catalogue app’s data where the fields.group subproperty has value 1:
    Shell
    (aw) $ bin/extract catalogue.attributeoption .fields.group 1

This is the source code for the script, which I place in the bin directory I created for the Frobshop webapp.

bin/extract
#!/bin/bash

unset APP
unset EXPRESSION
unset VALUE

if [ "$1" ]; then
  APP="$1"
  if [ "$APP" == all ]; then unset APP; fi
  shift
fi

if [ "$1" ]; then EXPRESSION="$1"; shift; fi

if [ "$1" ]; then VALUE="$1"; shift; fi

#echo "APP='$APP'; EXPRESSION='$EXPRESSION'; VALUE='$VALUE'"

if [ -z "$EXPRESSION" ]; then
  ./manage.py dumpdata $APP | jq
elif [ -z "$VALUE" ]; then
  ./manage.py dumpdata $APP | jq ".[]$EXPRESSION"
else
  ./manage.py dumpdata $APP | jq ".[] | select($EXPRESSION == $VALUE)"
fi

I also made a Python version of extract, using jq.py:

bin/extract.py
"""
Valid call syntaxes:
extract.py
extract.py my_app
extract.py my_app.app_property
extract.py my_app.app_property .property.sub_property
extract.py my_app.app_property .property.sub_property value


It would be nice to be able integrate this into manage.py. See
https://docs.djangoproject.com/en/3.1/howto/custom-management-commands/

One way to do this would be to override `django.contrib.admin`.
If this code was called `mslinn.admin.extras`, then it should be listed in
`INSTALLED_APPS` before `django.contrib.admin`, like this:

INSTALLED_APPS = [
    'mslinn.admin.extras',
    'django.contrib.admin',
     ...
]
"""

import json
import jq
import sys

def pprint(value):
    # value = json.loads(value)
    return json.dumps(value, indent=True, sort_keys=True)

"""
@return output of running command
"""
def run(command):
    import subprocess
    return subprocess \
               .run(command, stdout=subprocess.PIPE, shell=True, check=True) \
               .stdout \
               .decode("utf-8")


def main(argv):
    import jq

    app, expression, value, *extra = argv + ['' for x in range(3-len(argv))]
    app = '' if app == 'all' else app
    if extra or app=='-h':
        sys.exit(f"Syntax: {sys.argv[0]} [app [expression [value]]]")

    dumpdata = run(f'./manage.py dumpdata {app}')

    if not expression:
        result = json.loads(dumpdata)
    elif not value:
        result = jq.compile(f'.[] | {expression}').input(text=dumpdata).all()
    else:
        result = jq.compile(f'.[] | select({expression} == {value})').input(text=dumpdata).all()

    print(pprint(result))

main(sys.argv[1:])
* indicates a required field.

Please select the following to receive Mike Slinn’s newsletter:

You can unsubscribe at any time by clicking the link in the footer of emails.

Mike Slinn uses Mailchimp as his marketing platform. By clicking below to subscribe, you acknowledge that your information will be transferred to Mailchimp for processing. Learn more about Mailchimp’s privacy practices.