Numpy <-> JSON over Pandas

The goal

Normally we can not use JSON with Numpy. However, if we use Pandas as an intermediary then we can do it.

Questions to David Rotermund

Note: Pandas can also be used for many other formats beside JSON.

Writing JSON pandas.DataFrame.to_json

DataFrame.to_json(path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=None, indent=None, storage_options=None, mode='w')[source]

Convert the object to a JSON string.

Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps.

import numpy as np
import pandas as pd

rng = np.random.default_rng()

some_data = rng.random((11, 3))

df = pd.DataFrame(some_data)
# As file
filename = "mynumpydata.json"
df.to_json(filename, orient="index")

# As string
output = df.to_json(orient="index")
print(output)

Output (reformated):

{
    "0": {
        "0": 0.3145859169,
        "1": 0.2517001569,
        "2": 0.6685086575
    },
    "1": {
        "0": 0.7324177066,
        "1": 0.6750562092,
        "2": 0.0086333192
    },
    "2": {
        "0": 0.7529914827,
        "1": 0.3597052352,
        "2": 0.2780062722
    },
    "3": {
        "0": 0.2847410336,
        "1": 0.5572451873,
        "2": 0.5591149362
    },
    "4": {
        "0": 0.4507115703,
        "1": 0.9623511422,
        "2": 0.7180796014
    },
    "5": {
        "0": 0.5406601852,
        "1": 0.9315847158,
        "2": 0.2456480951
    },
    "6": {
        "0": 0.3441382077,
        "1": 0.4714817658,
        "2": 0.1777388975
    },
    "7": {
        "0": 0.6994839505,
        "1": 0.6520935819,
        "2": 0.9870686976
    },
    "8": {
        "0": 0.187576403,
        "1": 0.7466669157,
        "2": 0.2952841542
    },
    "9": {
        "0": 0.9140410582,
        "1": 0.6828387334,
        "2": 0.165762789
    },
    "10": {
        "0": 0.644055269,
        "1": 0.6122094952,
        "2": 0.9695111468
    }
}

Read JSON pandas.read_json

pandas.read_json(path_or_buf, *, orient=None, typ='frame', dtype=None, convert_axes=None, convert_dates=True, keep_default_dates=True, precise_float=False, date_unit=None, encoding=None, encoding_errors='strict', lines=False, chunksize=None, compression='infer', nrows=None, storage_options=None, dtype_backend=_NoDefault.no_default, engine='ujson')[source]

Convert a JSON string to pandas object.

import numpy as np
import pandas as pd

filename = "mynumpydata.json"
df = pd.read_json(filename, orient="index")
output_np = df.to_numpy()
print(type(output_np)) # -> <class 'numpy.ndarray'>
print(output_np)

Output:

[[0.31458592 0.25170016 0.66850866]
 [0.73241771 0.67505621 0.00863332]
 [0.75299148 0.35970524 0.27800627]
 [0.28474103 0.55724519 0.55911494]
 [0.45071157 0.96235114 0.7180796 ]
 [0.54066019 0.93158472 0.2456481 ]
 [0.34413821 0.47148177 0.1777389 ]
 [0.69948395 0.65209358 0.9870687 ]
 [0.1875764  0.74666692 0.29528415]
 [0.91404106 0.68283873 0.16576279]
 [0.64405527 0.6122095  0.96951115]]

The source code is Open Source and can be found on GitHub.