Process Tools › Exports

# Binary Results Export
## Overview

The new binary format stores variadic (variable-structure) data in the most compact form. The format has the following data layout:

1. A fixed-size file header. The file header stores the magic word, format version, content version, entry count, custom header size, and content identifier.
2. A custom file header, whose size is defined by the custom header size taken from the file header (present only if the custom header size is not zero).
3. A sequence of fixed-size file entry headers, each containing metadata about one entry. An entry header holds the following data: version, entry unique identifier, data offset, fixed item size (or 0 if the item size is not fixed), custom entry header size (or zero if the entry has no custom header), item count, data size (size of the data in bytes before compression), result data size (size of the data in bytes after compression), compression level (0 means compression is disabled for the entry), and data flags.
4. A sequence of the entry data itself. Each entry is described by the following sections: the custom entry header (if specified), the item data offsets (in bytes; each index is stored as a `size_t` (`uint64_t`) value and is used to address items by index when the item size is not fixed), and the entry data itself.

Note: don't forget to decompress the raw data using `zlib` if `compressionLevel != 0`.

## Results File Setup and Export

The results file is a portable file that contains the classification results and is intended to automatically share results between systems. Results files can be exported as either a human-readable docid\ yahokjegvm99gkhmbmt9o , a SQLite-based docid mwxvawdypqiaurld6duf , or the compact and portable docid\ aonct13cbxvn eqsuxhxy .

The results export is designed to hold classification results without the raw data or the classification methods. The results file is not a backup or restore file: neither the X-ray classification table nor the classification workflow is stored in this file. The file can be shared with end users, and they cannot reconstruct the classification methodology from it.

### Setup

To set a default results file export on the completion of reclassification, do the following:

1. Open Default Settings.
2. Expand the section "Calculation Result Export Options".
3. Select the export format: JSON, MRE, or Binary.
4. Select the export folder locations.
5. Set the export options. Some options apply only to specific exports; see each type below for the specifics.

The system can be set to save the results file in two locations, a PM folder and a non-PM folder; the non-PM folder is the default path.

The available export options are: Export Segments, Export Points, Export Spectra, Export Images, Quant Materials, Use Compression, and Use Indentation. Settings that are not applicable to the selected format are ignored; selecting them will have no impact on the file export.

### Automatic Generation and Update

If an export format has been selected in the settings, the file is automatically generated the first time a classification is run and automatically updated after each classification, including each classification in batch processes. Manual edits or saves are only applied after reclassification or a manual file update, not on file save or apply.

### Manual Generation and Update

To generate or update the file without running a classification, do the following:

1. Ensure the export format (file type) is selected in the settings.
2. Activate the "Other Tools" ribbon.
3. Select "Update Classification Results File".
4. Optionally, select to open the folder location.

### Binary Settings

In AMICS Process, open the docid\ hwyar6 exfvz2pw8auh2t by selecting the "wrench" icon. Specify the settings required for the binary file export and the paths to the PM and non-PM export locations. The following settings may be applied:
- **Export Segments** — includes the segment table in the binary export. Each segment has an entry, so this is an effective archive of the classification table from the export.
- **Export Points** — adds a table to the file for any materials in the TMQ screen, with the X/Y locations and quantification results.
- **Export Spectra** — an archive copy of all spectra in the database.
- **Export Images** — includes the image layers as tiles in the export.
- **Quant Materials** — applies the selected material quantification (or the standard quantification if none is specified) to a sum spectrum from each material. The result is presented in the material table.
- **Use Compression** — exports the file as a zip archive.
- **Use Indentation** — cannot be applied to this format.

Settings that are not applicable are ignored; selecting them will have no impact on the file export.

When set, the export will be updated at the end of each classification. It is also possible to manually initiate the export by selecting "Update Calculations Results File" in the Other Tools ribbon.

## Layout

The format was originally implemented in C++, so the constants and structure layout are described using C++. The following information applies to format version `0`.

### Structures

```cpp
constexpr static int FORMAT_VERSION = 0;
constexpr static int MAGIC_WORD = 0x12345678;
constexpr static int ENTRY_IDENTIFIER_BUFFER_SIZE = 64;
constexpr static int FILE_HEADER_IDENTIFIER_BUFFER_SIZE = 64;

enum DataFlags : size_t
{
    None
};

// Disable packing for storage structures
#pragma pack(push, 1)

struct FileHeader
{
    int magicWord;            // magic word
    size_t headerSize;        // size of this header (in bytes)
    int formatVersion;        // format version counter (file or entry, i.e. the entry version might change, but the file version remains the same)
    int contentVersion;       // content version counter
    size_t entryCount;        // count of entries
    size_t customHeaderSize;  // size of custom header (in bytes)
    wchar_t contentIdentifier[FILE_HEADER_IDENTIFIER_BUFFER_SIZE]; // contains a hint about the file content
};

struct FileEntryHeader
{
    size_t headerSize;            // size of this header (in bytes)
    int version;                  // content version counter
    wchar_t identifier[ENTRY_IDENTIFIER_BUFFER_SIZE]; // unique identifier of the entry, for example "particles"
    size_t dataOffset;            // offset in file; if the entry has a header it points to the entry header,
                                  // i.e. the data will be at `dataOffset + customEntryHeaderSize + itemOffsets (if size not fixed)`
    size_t fixedItemSize;         // size == 0 means not fixed
    size_t customEntryHeaderSize; // size == 0 means no custom header
    size_t itemCount;             // count of items
    size_t dataSize;              // initial data size
    size_t resultDataSize;        // compressed data size
    int compressionLevel;         // 0 means not compressed
    DataFlags flags;              // flags of the entry
};

#pragma pack(pop)
```

### Fixed Items

To read fixed items, set the stream pointer to the entry data offset, allocate a buffer big enough to store all of the data, read the data (don't forget to decompress the retrieved data if it is compressed, i.e. `compressionLevel != 0`), and access the items as if working with a flat array.

### Unfixed Items

To read unfixed items, set the stream pointer to the entry data offset, allocate a buffer for the item offsets (its size will be `sizeof(size_t) * itemCount`), and read the item offsets array. Then allocate a buffer big enough to store all of the raw data, read the data (again, decompress it if `compressionLevel != 0`), and access the items by the offsets taken from the item offsets array.
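The two access paths above can be condensed into a single helper. The following is an illustrative sketch, not part of the file format or the AMICS API: `EntryView` and `itemAt` are hypothetical names, and the entry data is assumed to have already been read and decompressed.

```cpp
#include <cassert>   // for the usage checks below
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one decoded entry: field names follow the
// FileEntryHeader described above, but this type is only an illustration.
struct EntryView {
    std::size_t fixedItemSize = 0;           // 0 => variable-sized items
    std::vector<std::uint64_t> itemOffsets;  // filled only when fixedItemSize == 0
    std::vector<std::uint8_t> data;          // decompressed entry data
};

// Address item `index`: a fixed-size entry behaves like a flat array,
// while a variable-sized entry is addressed through the offsets table.
const std::uint8_t* itemAt(const EntryView& entry, std::size_t index) {
    const std::uint64_t offset = entry.fixedItemSize != 0
        ? static_cast<std::uint64_t>(entry.fixedItemSize) * index
        : entry.itemOffsets[index];
    return entry.data.data() + offset;
}
```

For example, with `fixedItemSize == 4` the third item starts 12 bytes into the data buffer, whereas a variable-sized entry with offsets `{0, 3, 7}` places its third item at byte 7.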
## Results File

A binary file with AMICS calculation results has the content identifier `amics calc results`. The following information applies to content version `1`.

### Entries

Ultimately (when all export options are enabled), a results file might have the following entries:

- "metadata" (metadata)
- "material" (fixed `MaterialDto` structure)
- "material spectra" (spectra)
- "particle" (fixed `ParticleDto` structure)
- "particle images bse" (images)
- "particle images secondary bse" (images/optional)
- "particle images classified" (images)
- "particle material composition" (fixed `ParticleMaterialCompositionDto` structure)
- "grain" (fixed `GrainDto` structure)
- "grain association" (fixed `GrainAssociationDto` structure)
- "segment" (fixed `SegmentDto` structure)
- "xray point" (fixed `XrayPointDto` structure)
- "xray point spectra" (spectra)
- "custom point" (fixed `CustomPointDto` structure)
- "field" (fixed `FieldDto` structure)
- "field images bse" (images)
- "field images secondary bse" (images/optional)
- "field images classified" (images)
- "material modal" (fixed `MaterialModalDto` structure)
- "calculated elements assay" (fixed `CalculatedAssayDto` structure)
- "quantified elements assay" (fixed `CalculatedAssayDto` structure)
- "material composition" (fixed `MaterialCompositionDto` structure)
- "calculated material composition" (fixed `MaterialCompositionDto` structure)
- "quantified material composition" (fixed `MaterialCompositionDto` structure)
- "material association" (fixed `MaterialAssociationDto` structure)
- "material association parameter" (fixed `MaterialAssociationParameterDto` structure)

### Metadata

Metadata items are non-fixed-size data. Each item starts with a `MetadataItemHeader`, which contains the item identifier length (count of symbols), the type (int32, string, etc.), and the size of the value (in bytes). The identifier itself goes right after the header; the buffer size for the identifier will be `(identifierLength + 1) * 2` bytes (a buffer for a null-terminated string of length `identifierLength`). The data itself goes after the identifier.

Currently, AMICS exports the following metadata key/value pairs:

- "software version" (string) — version of the software used for the data export
- "software name" (string) — name of the software used for the data export
- "method" (string) — name of the method used for classification
- "project name" (string) — name of the project
- "sample name" (string) — name of the sample
- "sample uuid" (string) — sample unique identifier
- "duration" (double) — measurement duration (in seconds)
- "experiment method" (string) — name of the method used for classification, currently the same as "method"
- "experiment name" (string) — name of the experiment
- "experiment uuid" (string) — unique identifier of the experiment
- "file location" (string) — location of the original measurement file used for the results file generation
- "license experiment" (string) — license number of the user that captured the measurement
- "license source" (string) — license number of the user that generated the file
- "license first update" (string) — license number of the user that initially generated the file
- "license last update" (string) — license number of the user that was the last to update the file
- "material grouping name" (string) — name of the material grouping used during results generation
- "measurement time" (string) — formatted date when the measurement was started
- "pixel size" (double) — size of a pixel, used for pixel-to-micron conversion
- "results creation time" (string) — formatted date when the file was created
- "specimen name" (string) — name of the specimen
- "total area" (uint64) — total area (sum, in pixels) of the exported particles
- "total weight" (double) — total weight (sum) of the exported particles
- "has secondary data" (int32) — `1` if secondary data is present in the exported measurement, `0` if not

### Images

Images are non-fixed-size data, so to access images by index use the item offsets array. Each image item consists of an `ImageItemHeader` structure, which allows retrieval of the image pixel format (uint8, uint16, uint32, int8, int16, int32, rgba32, etc.), image width, and image height. The actual image pixel data lies right after the image item header.

Note: if the image item pixel format is equal to `ImagePixelFormat::None`, the image is not available for that particular item.

### Spectra

Spectra are fixed-size data (each item has size `channelCount * sizeOfChannel`) with a custom header represented by the `SpectraEntryHeader` structure, which provides the channel count, the channel format (int/double/float, etc.), and the combine and sum method names. By default, AMICS exports `int32` channels for spectra.

### Structures

```cpp
static const wchar_t* CONTENT_IDENTIFIER = L"amics calc results";

typedef double float_for_export;
typedef wchar_t char_for_export;

#pragma pack(push, 1)

struct RectDto { int x, y, width, height; };
struct Point2dDto { int x, y; };

enum class SpectraChannelFormat : uint8_t
{
    Int8, Int16, Int32, UInt8, UInt16, UInt32, Float, Double
};

static constexpr size_t METHOD_NAME_BUFFER_SIZE = 64;

struct SpectraEntryHeader
{
    size_t channelCount;
    SpectraChannelFormat format;
    char_for_export combineMethodName[METHOD_NAME_BUFFER_SIZE];
    char_for_export sumMethodName[METHOD_NAME_BUFFER_SIZE];
};

enum class MetadataItemType : uint8_t
{
    Int32, UInt32, Int64, UInt64, Float, Double, String, Blob
};

struct MetadataItemHeader
{
    size_t identifierLength;
    size_t dataSize;
    MetadataItemType type;
};

enum class ImagePixelFormat : uint8_t
{
    None, Custom, UInt8, UInt16, UInt32, Int8, Int16, Int32, Rgba32
};

struct ImageItemHeader
{
    uint32_t width;
    uint32_t height;
    uint32_t bytesPerPixel;
    ImagePixelFormat format;
};

struct ObjectBaseDto
{
    int id;
    int minBse;
    int maxBse;
    int averageBse;
    size_t xrayCount;
    size_t areaInPixels;
    float_for_export areaInMicrons;
    float_for_export weightPercent;
    float_for_export areaPercent;
    float_for_export density;
    float_for_export weight;
    RectDto boundingRectInMeasurement;
};

struct ObjectBaseExDto : public ObjectBaseDto
{
    float_for_export size;
    float_for_export equCircle;
    float_for_export equEllipse;
    float_for_export maxLength;
    float_for_export minWidth;
    float_for_export perimeter;
};

struct SegmentDto : public ObjectBaseDto
{
    int particleId;
    int grainId;
    int materialId;
};

struct GrainDto : public ObjectBaseExDto
{
    int particleId;
    int materialId;
    size_t segmentCount;
    float_for_export freePerimeter;
};

struct PointBaseDto
{
    int id;
    Point2dDto positionInMeasurement;
    Point2dDto positionInStage;
};

struct XrayPointDto : public PointBaseDto
{
    int particleId;
    int grainId;
    int segmentId;
    int materialId;
    int greyLevel;
    int totalCounts;
};

static constexpr size_t MAX_CUSTOM_POINT_NAME_BUFFER_SIZE = 256;

struct CustomPointDto : public PointBaseDto
{
    wchar_t name[MAX_CUSTOM_POINT_NAME_BUFFER_SIZE];
};

struct ParticleDto : public ObjectBaseExDto
{
    float_for_export shapeFactor;
    float_for_export minAxisMetricsX;
    float_for_export maxAxisMetricsX;
    size_t segmentCount;
    size_t grainCount;
    float_for_export hullArea;
    float_for_export hullPerimeter;
};

struct ParticleMaterialCompositionDto
{
    int particleId;
    int materialId;
    size_t grainCount;
    size_t particleAreaInPixels;
    size_t materialAreaInPixels;
    float_for_export particleWeight;
    float_for_export materialAreaInMicrons;
    float_for_export materialAreaPercent;
    float_for_export materialWtPercent;
};

struct GrainAssociationDto
{
    int particleId;
    int grainId1;
    int grainId2;
    float_for_export assocLength;
};

struct FieldDto
{
    int id;
    size_t segmentCount;
    size_t xrayCount;
    RectDto boundingRectInMeasurement;
    RectDto scanFieldRect;
};

constexpr static int ELEMENT_NAME_MAX_SIZE = 4;

struct CalculatedAssayDto
{
    float_for_export percent;
    size_t particleCount;
    char_for_export elementName[ELEMENT_NAME_MAX_SIZE];
};

struct MaterialCompositionDto
{
    int materialId;
    char_for_export elementName[ELEMENT_NAME_MAX_SIZE];
    float_for_export wtPercent;
};

struct MaterialAssociationDto
{
    int sourceMaterialId;
    int targetMaterialId;
    size_t particleCount;
    float_for_export percent;
};

struct MaterialAssociationParameterDto
{
    int materialId = 0;
    size_t particleCount = 0;
    float_for_export value = 0.0;
};

constexpr static int MATERIAL_NAME_MAX_SIZE = 256;
constexpr static int MATERIAL_FORMULA_MAX_SIZE = 256;

struct MaterialModalDto
{
    int materialId;
    float_for_export weightPercent;
    float_for_export totalWeight;
    float_for_export areaMicrons;
    float_for_export areaPercent;
    float_for_export density;
    size_t areaPixels;
    size_t particleCount;
    size_t xrayCount;
    size_t grainCount;
    size_t segmentCount;
    uint32_t colorHex;
    char_for_export name[MATERIAL_NAME_MAX_SIZE];
};

struct MaterialDto
{
    int id;
    uint32_t colorHex;
    int totalCounts;
    float_for_export density;
    float_for_export atomicNumber;
    char_for_export chemicalFormula[MATERIAL_FORMULA_MAX_SIZE];
    char_for_export name[MATERIAL_NAME_MAX_SIZE];
};

#pragma pack(pop)
```

### Reader

This code is provided as a possible data reader implementation. More versions will be provided as they are developed.

#### C++

```cpp
Stream stream = ...; // stream used as binary data input

// Read the file header and validate it
FileHeader fileHeader;
stream.read(&fileHeader, sizeof(fileHeader));
if (fileHeader.headerSize != sizeof(fileHeader) ||
    fileHeader.magicWord != MAGIC_WORD ||
    wcscmp(fileHeader.contentIdentifier, CONTENT_IDENTIFIER) != 0)
{
    throw WrongFileException();
}

// Read the custom file header (if it exists)
Array<Byte> customFileHeader;
if (fileHeader.customHeaderSize != 0)
{
    customFileHeader.resize(fileHeader.customHeaderSize);
    stream.read(customFileHeader.data(), customFileHeader.size());
}

// Read the file entry headers
Array<FileEntryHeader> entryHeaders(fileHeader.entryCount);
stream.read(entryHeaders.data(), sizeof(FileEntryHeader) * fileHeader.entryCount);

// Try to read data from all entries
for (size_t i = 0; i < entryHeaders.size(); ++i)
{
    FileEntryHeader entryHeader = entryHeaders[i];
    log("reading entry with identifier ", entryHeader.identifier);

    stream.seek(entryHeader.dataOffset, Stream::FromBegin);

    // Try to read the custom entry header
    Array<Byte> customEntryHeader;
    if (entryHeader.customEntryHeaderSize != 0)
    {
        customEntryHeader.resize(entryHeader.customEntryHeaderSize);
        stream.read(customEntryHeader.data(), customEntryHeader.size());
    }

    // Try to read the item offsets (if the data is unfixed)
    Array<UInt64> itemOffsets;
    if (entryHeader.fixedItemSize == 0)
    {
        itemOffsets.resize(entryHeader.itemCount);
        stream.read(itemOffsets.data(), entryHeader.itemCount * sizeof(UInt64));
    }

    // Read the data itself
    Array<Byte> data(entryHeader.resultDataSize);
    stream.read(data.data(), data.size());

    // Decompress the data if compressed and replace the compressed buffer
    if (entryHeader.compressionLevel != 0)
    {
        Array<Byte> tempBuffer(entryHeader.dataSize);
        zlibUncompress(tempBuffer.data(), data.data(), entryHeader.resultDataSize);
        data = std::move(tempBuffer);
    }

    // Try to access all items by index
    for (size_t k = 0; k < entryHeader.itemCount; ++k)
    {
        const UInt64 itemOffset = entryHeader.fixedItemSize != 0
            ? entryHeader.fixedItemSize * k
            : itemOffsets[k];
        const Byte* itemPtr = data.data() + itemOffset;
        // TODO: do something with the item
    }
}
```

#### Python

This code is under development. It will be posted as soon as it is ready.

#### C#

This code is under development. It will be posted as soon as it is ready.
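Until an official Python reader is published, the header layout described in the Layout section can be parsed with a short, unofficial sketch. It assumes little-endian byte order and a 2-byte `wchar_t` (the Windows convention implied by the C++ declarations above); all class and function names here are illustrative, not part of AMICS.

```python
# Unofficial sketch of a Python reader for the binary results format.
# Assumes: little-endian, 2-byte wchar_t (UTF-16LE), #pragma pack(1) layout.
import struct
import zlib
from dataclasses import dataclass
from typing import BinaryIO, List

MAGIC_WORD = 0x12345678

# "<" disables padding, matching #pragma pack(push, 1)
FILE_HEADER_FMT = "<iQiiQQ128s"        # FileHeader, 164 bytes
ENTRY_HEADER_FMT = "<Qi128sQQQQQQiQ"   # FileEntryHeader, 200 bytes


def _decode_wchar(raw: bytes) -> str:
    """Decode a fixed-size wchar_t buffer (UTF-16LE, null-padded)."""
    return raw.decode("utf-16-le").split("\x00", 1)[0]


@dataclass
class FileHeader:
    magic_word: int
    header_size: int
    format_version: int
    content_version: int
    entry_count: int
    custom_header_size: int
    content_identifier: str


@dataclass
class EntryHeader:
    header_size: int
    version: int
    identifier: str
    data_offset: int
    fixed_item_size: int
    custom_entry_header_size: int
    item_count: int
    data_size: int
    result_data_size: int
    compression_level: int
    flags: int


def read_file_header(stream: BinaryIO) -> FileHeader:
    raw = stream.read(struct.calcsize(FILE_HEADER_FMT))
    f = struct.unpack(FILE_HEADER_FMT, raw)
    header = FileHeader(*f[:6], _decode_wchar(f[6]))
    if header.magic_word != MAGIC_WORD:
        raise ValueError("not an AMICS binary results file")
    return header


def read_entry_headers(stream: BinaryIO, header: FileHeader) -> List[EntryHeader]:
    # Entry headers follow the file header and the optional custom header.
    stream.seek(header.header_size + header.custom_header_size)
    size = struct.calcsize(ENTRY_HEADER_FMT)
    entries = []
    for _ in range(header.entry_count):
        f = struct.unpack(ENTRY_HEADER_FMT, stream.read(size))
        entries.append(EntryHeader(f[0], f[1], _decode_wchar(f[2]), *f[3:]))
    return entries


def read_entry_data(stream: BinaryIO, entry: EntryHeader) -> bytes:
    # Skip the custom entry header and the offsets table (if present),
    # then read and, if necessary, decompress the payload.
    offsets_size = 0 if entry.fixed_item_size else entry.item_count * 8
    stream.seek(entry.data_offset + entry.custom_entry_header_size + offsets_size)
    raw = stream.read(entry.result_data_size)
    return zlib.decompress(raw) if entry.compression_level != 0 else raw
```

For variable-sized entries, the offsets table skipped by `read_entry_data` would still need to be read separately to address individual items, as in the C++ reader above.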
