CSH Crowdsourcing Metadata Schema
Image
Label | Property | Range | Usage | Obligation |
ID | .id | String | The ID generated by MongoDB | 1 |
File properties | .file | File | The file properties for this image | 1 |
Scan properties | .scan | Scan | The page from which this image was extracted and its properties | 1 |
Can crowdsource | .canCrowdsource | String | Whether this image is approved to be crowdsourced (i.e., yes, no, maybe) | 0 - 1 |
Transcription properties | .transcription | Transcription | The transcription status and current labels for this image | 0 - 1 |
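As a rough illustration of how the Obligation column might be read, the sketch below treats an obligation of 1 as "required" and 0 - 1 as "optional", and checks a record for missing required top-level properties. The helper name `missing_required_fields` and every value in `example_image` are hypothetical; only the property names come from the tables in this schema.

```python
from typing import List

# Top-level Image properties with obligation 1 in the table above.
# canCrowdsource and transcription have obligation 0 - 1 and may be absent.
REQUIRED_IMAGE_FIELDS = ("id", "file", "scan")


def missing_required_fields(image: dict) -> List[str]:
    """Return the required top-level properties absent from an image record."""
    return [name for name in REQUIRED_IMAGE_FIELDS if name not in image]


# Hypothetical record: every value below is made up for illustration.
example_image = {
    "id": "example-id",
    "file": {
        "height": 40,
        "width": 120,
        "size": 2048,
        "origPath": "original/example.png",
        "anonPath": "anonymized/example.png",
    },
    "scan": {
        "lineNum": 3,
        "itemGroupNum": 1,
        "wordNum": 5,
        "scanNum": 12,
        "pixelLocation": {"x": 300, "y": 850},
    },
    "canCrowdsource": "yes",
}

print(missing_required_fields(example_image))         # []
print(missing_required_fields({"id": "example-id"}))  # ['file', 'scan']
```

The same check could be repeated for the nested File, Scan, and Transcription objects, each of which has its own obligations.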
File
Label | Property | Range | Usage | Obligation |
Height | .file.height | Number | The height of the image in pixels | 1 |
Width | .file.width | Number | The width of the image in pixels | 1 |
Size | .file.size | Number | The size of the image in bytes | 1 |
Original file path | .file.origPath | String | The path with original file name | 1 |
Anonymized file path | .file.anonPath | String | The path with anonymized file name | 1 |
Scan
Label | Property | Range | Usage | Obligation |
Line number | .scan.lineNum | Number | The line number from which this image was extracted | 1 |
Item group number | .scan.itemGroupNum | Number | The register of which this image is a part | 1 |
Word number | .scan.wordNum | Number | The word number from which this image was extracted | 1 |
Scan number | .scan.scanNum | Number | The page number from which this image was extracted | 1 |
Pixel location | .scan.pixelLocation | PixelLocation | The pixel location of the image with respect to the page | 1 |
PixelLocation
Label | Property | Range | Usage | Obligation |
x position | .scan.pixelLocation.x | Number | The x position of the image in pixels | 1 |
y position | .scan.pixelLocation.y | Number | The y position of the image in pixels | 1 |
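The pixel location and the file dimensions together suggest where the image sits on the scanned page. The sketch below assumes that (x, y) is the top-left corner of the image and that the file width and height are measured in the same pixel scale as the page; the schema states neither, so both are assumptions. The function name `crop_box` is hypothetical.

```python
from typing import Tuple


def crop_box(image: dict) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the image on its page.

    Assumes pixelLocation (x, y) is the top-left corner, which the
    schema does not state.
    """
    location = image["scan"]["pixelLocation"]
    left, top = location["x"], location["y"]
    right = left + image["file"]["width"]
    bottom = top + image["file"]["height"]
    return (left, top, right, bottom)


# Made-up values for illustration.
example = {
    "file": {"height": 40, "width": 120},
    "scan": {"pixelLocation": {"x": 300, "y": 850}},
}
print(crop_box(example))  # (300, 850, 420, 890)
```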
Transcription
Label | Property | Range | Usage | Obligation |
Number of classifications | .transcription.numClassifications | Number | The number of classifications needed for this image | 1 |
Subject set ID | .transcription.subjectSetId | SubjectSet_id | A pointer to the Zooniverse subject set ID to which this image belongs | 1 |
Status | .transcription.status | String | The transcription status of the image (i.e., to send, sent, received, finished, gold standard) | 1 |
Answer | .transcription.answer | Answer | The aggregated worker responses | 0 - 1 |
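One way `numClassifications` might be used is to decide whether an image still needs labeller responses. The sketch below assumes that `numClassifications` is the target total (not the number still outstanding) and that the responses collected so far live in the answer's `responses` list; the schema does not settle either point. The function name `classifications_remaining` is hypothetical.

```python
def classifications_remaining(transcription: dict) -> int:
    """Return how many more labeller responses are needed, never below zero."""
    # The answer has obligation 0 - 1, so it may be missing entirely.
    answer = transcription.get("answer") or {}
    collected = len(answer.get("responses") or [])
    return max(transcription["numClassifications"] - collected, 0)


# Made-up transcription with two of three responses collected.
example_transcription = {
    "numClassifications": 3,
    "answer": {
        "responses": [
            {"type": "single word", "label": "Smith"},
            {"type": "single word", "label": "Smith"},
        ]
    },
}
print(classifications_remaining(example_transcription))  # 1
```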
Answer
Label | Property | Range | Usage | Obligation |
Image type | .answer.type | String | The aggregated type of image from labeller responses | 0 - 1 |
Image label | .answer.label | String | The aggregated label from labeller responses | 0 - 1 |
Labeller responses | .answer.responses | List: Response | The list of labeller responses | 0 - 1 |
Response
Label | Property | Range | Usage | Obligation |
Labeller ID | .labellerId | String | The Zooniverse ID of the labeller | 0 - 1 |
Image type | .type | String | The type of image the labeller indicated (i.e., not a word, partial word, single word, multiple words) | 1 |
Image label | .label | String | The word in the image indicated by the labeller | 1 |
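The schema says the answer holds aggregated values but does not say how they are aggregated. As one possible approach, the sketch below builds an Answer-shaped dictionary from Response-shaped dictionaries by simple majority vote. Majority voting, and the function names `majority` and `aggregate`, are assumptions made for illustration and are not taken from the schema.

```python
from collections import Counter
from typing import List, Optional


def majority(values: List[str]) -> Optional[str]:
    """Return the most frequent value, or None for an empty list."""
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


def aggregate(responses: List[dict]) -> dict:
    """Build an Answer-shaped dict from Response-shaped dicts by majority vote."""
    return {
        "type": majority([r["type"] for r in responses]),
        "label": majority([r["label"] for r in responses]),
        "responses": responses,
    }


# Made-up responses for illustration.
example_responses = [
    {"labellerId": "a", "type": "single word", "label": "Smith"},
    {"labellerId": "b", "type": "single word", "label": "Smith"},
    {"labellerId": "c", "type": "partial word", "label": "Smit"},
]
answer = aggregate(example_responses)
print(answer["type"], "/", answer["label"])  # single word / Smith
```

Voting on type and label independently keeps the sketch short; a real pipeline might instead vote on the label only among responses that agree on the type.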
Reference Table (Not included in schema)
| S. No. | Label | Property | Range | Example | Usage | Obligation | Comments |
|---|---|---|---|---|---|---|---|
| 1 | Anonymized File name | anonymizedImageFile | | | | 0 - 1 | |
| 2 | Location-based filename | locationBasedImageFile | | | | 1 | |
| 3 | Page number | numPage | xsd:string | 6.0, 1.5, etc. | A data element that designates the version of the format named in "Format Name" | 0 or 1 | |
| 4 | Location of image | locationY | premis:hasMessageDigest? nfo:hashValue? fedora:digest? | xsd:string, or xsd:anyURI | May have more than one checksum using different algorithms (differentiated with either URN syntax or separate properties for each algorithm). | | |
| 5 | height | height | rdfs:Literal | | Date of creation of the resource. | 0 or 1 | |
| 6 | Image number in the file | numWord | rdfs:Literal | | The last modification date | 0 or 1 | |
| 7 | Label | locationX | xsd:string | | A human-readable label or string that can be used as a simple surrogate for the resource | min 0, max unbounded | |
| 8 | segmentType | - | - | - | - | 0 - 1 | |
| 9 | File Name | - | - | - | - | 0 - 1 | |
| 10 | width of the image | width | xsd:string | | A human readable string that specifies the name of the manufacturer of the scanner | 0 - 1 | Recommended |
| 11 | register Number | register | xsd:string | | A human readable string that specifies the name of the model of the scanner | 0 - 1 | Recommended |