CLAM Formats

class clam.common.formats.AlpinoXMLFormat(file, **kwargs)
    attributes = {}
    mimetype = 'text/xml'
    name = 'Alpino XML'
    schema = ''

class clam.common.formats.BinaryDataFormat(file, **kwargs)
    attributes = {}
    mimetype = 'application/octet-stream'
    name = 'Application-specific Binary Data'

class clam.common.formats.CSVFormat(file, **kwargs)
    attributes = {'encoding': StringParameter encoding, 'language': StringParameter language}
    mimetype = 'text/csv'
    name = 'Comma Separated Values'

class clam.common.formats.DCOIFormat(file, **kwargs)
    attributes = {}
    mimetype = 'text/xml'
    name = 'DCOI format'
    schema = ''

class clam.common.formats.DjVuFormat(file, **kwargs)
    attributes = {}
    mimetype = 'image/x-djvu'
    name = 'DjVu format'

class clam.common.formats.ExampleFormat(file, **kwargs)
    This is an example format; inspect its source code if you want to create custom formats.

    allowcustomattributes = True
    attributes = {}
    httpheaders()
        HTTP headers to output for this format. Yields (key, value) tuples.
    mimetype = 'text/plain'
    schema = None
    validator()
        Implement your validator here; it should return True or False. Additionally, if there is metadata in the actual file, this method should extract it and assign it to this object. It is called automatically from the constructor. The file (CLAMFile) is accessible through self.file, which is guaranteed to exist when this method is called.

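The validator contract described for ExampleFormat can be sketched without CLAM installed. The block below is only an illustration under stated assumptions: `SketchFormatBase` is a hypothetical stand-in, not CLAM's real base class, and `FakeFile` is a hypothetical replacement for CLAMFile. Only the class attribute names and the method names `validator` and `httpheaders` are taken from the listing above; the `data` and `valid` attributes are invented for the sketch.

```python
import io


class FakeFile:
    """Hypothetical stand-in for CLAMFile: wraps text and iterates by line."""

    def __init__(self, text):
        self._text = text

    def __iter__(self):
        return iter(io.StringIO(self._text))


class SketchFormatBase:
    """Hypothetical stand-in base class, NOT CLAM's real one.

    It mimics only the documented contract: self.file is set first,
    then validator() is called from the constructor.
    """

    attributes = {}
    mimetype = 'text/plain'
    schema = None

    def __init__(self, file, **kwargs):
        self.file = file
        self.data = dict(kwargs)  # metadata attributes passed by the caller
        self.valid = self.validator()

    def validator(self):
        return True

    def httpheaders(self):
        yield ('Content-Type', self.mimetype)


class KeyValueFormat(SketchFormatBase):
    """Hypothetical custom format: every non-empty line must be 'key=value'."""

    name = 'Key-Value Text'

    def validator(self):
        count = 0
        for line in self.file:
            line = line.strip()
            if not line:
                continue
            if '=' not in line:
                return False
            count += 1
        # Metadata found in the file itself is stored on the object.
        self.data['entries'] = count
        return True


good = KeyValueFormat(FakeFile('a=1\nb=2\n'), encoding='utf-8')
bad = KeyValueFormat(FakeFile('a=1\nnot a pair\n'))
headers = list(good.httpheaders())
```

In this sketch `good.valid` is True and `bad.valid` is False, and the entry count extracted during validation ends up next to the caller-supplied metadata. A real custom format would subclass whatever base class ExampleFormat itself uses in the CLAM source.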
class clam.common.formats.FoLiAXMLFormat(file, **kwargs)
    attributes = {'chunk-annotation': StringParameter chunk-annotation, 'entity-annotation': StringParameter entity-annotation, 'lemma-annotation': StringParameter lemma-annotation, 'paragraph-annotation': StringParameter paragraph-annotation, 'pos-annotation': StringParameter pos-annotation, 'relation-annotation': StringParameter relation-annotation, 'sense-annotation': StringParameter sense-annotation, 'sentence-annotation': StringParameter sentence-annotation, 'syntax-annotation': StringParameter syntax-annotation, 'text-annotation': StringParameter text-annotation, 'token-annotation': StringParameter token-annotation, 'version': StringParameter version}
    mimetype = 'text/xml'
    name = 'FoLiA XML'
    schema = ''
    validator()
        This method can be overridden in derived classes and has no implementation here; it should return True or False. Additionally, if there is metadata in the actual file, this method should extract it and assign it to this object. It is called automatically from the constructor. The file (CLAMFile) is accessible through self.file, which is guaranteed to exist when this method is called.

class clam.common.formats.FrogTSVFormat(file, **kwargs)
    attributes = {'chunking': ChoiceParameter chunking, 'lemmatisation': ChoiceParameter lemmatisation, 'morphologicalanalysis': ChoiceParameter morphologicalanalysis, 'mwudetection': ChoiceParameter mwudetection, 'namedentities': ChoiceParameter namedentities, 'parsing': ChoiceParameter parsing, 'postagging': ChoiceParameter postagging, 'tokenisation': StaticParameter tokenisation: yes}
    mimetype = 'text/plain'
    name = 'Frog Tab Separated Values'

class clam.common.formats.GifImageFormat(file, **kwargs)
    attributes = {}
    mimetype = 'image/gif'
    name = 'Gif Image'

class clam.common.formats.HTMLFormat(file, **kwargs)
    HTML Format Definition. This format has one required attribute: encoding.

    attributes = {'encoding': StringParameter encoding, 'language': StringParameter language}
    httpheaders()
        HTTP headers to output for this format. Yields (key, value) tuples.
    mimetype = 'text/html'
    name = 'HTML Format'

class clam.common.formats.JSONFormat(file, **kwargs)
    mimetype = 'application/json'
    name = 'JSON Format (generic, not further specified)'

class clam.common.formats.JpegImageFormat(file, **kwargs)
    attributes = {}
    mimetype = 'image/jpeg'
    name = 'Jpeg Image'

class clam.common.formats.KBXMLFormat(file, **kwargs)
    mimetype = 'text/xml'
    name = 'Koninklijke Bibliotheek XML-formaat'
    schema = ''

class clam.common.formats.MP3AudioFormat(file, **kwargs)
    attributes = {}
    mimetype = 'audio/mpeg'
    name = 'MP3 Audio File'

class clam.common.formats.MSWordFormat(file, **kwargs)
    attributes = {}
    mimetype = 'application/msword'
    name = 'Microsoft Word format'
    schema = ''

class clam.common.formats.MpegVideoFormat(file, **kwargs)
    attributes = {}
    mimetype = 'video/mpeg'
    name = 'Mpeg Video'

class clam.common.formats.OggAudioFormat(file, **kwargs)
    attributes = {}
    mimetype = 'audio/vorbis'
    name = 'Ogg Vorbis Audio File'

class clam.common.formats.OggVideoFormat(file, **kwargs)
    attributes = {}
    mimetype = 'audio/ogg'
    name = 'Ogg Video File'

class clam.common.formats.OpenDocumentTextFormat(file, **kwargs)
    attributes = {}
    mimetype = 'application/vnd.oasis.opendocument.text'
    name = 'Open Document Text Format'

class clam.common.formats.PDFFormat(file, **kwargs)
    attributes = {}
    mimetype = 'application/pdf'
    name = 'PDF'

class clam.common.formats.PlainTextFormat(file, **kwargs)
    Plain Text Format Definition. This format has one required attribute: encoding.

    attributes = {'encoding': StringParameter encoding, 'language': StringParameter language}
    httpheaders()
        HTTP headers to output for this format. Yields (key, value) tuples.
    mimetype = 'text/plain'
    name = 'Plain Text Format'

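The listing says that PlainTextFormat requires an `encoding` attribute and that `httpheaders()` yields (key, value) tuples. One plausible way the two fit together is to advertise the encoding as the charset of the Content-Type header. That link is an assumption, not something the listing confirms, and `SketchTextFormat` below is a hypothetical stand-in rather than CLAM's own class.

```python
class SketchTextFormat:
    """Hypothetical stand-in, not CLAM's PlainTextFormat."""

    mimetype = 'text/plain'
    name = 'Plain Text Format'

    def __init__(self, file, **kwargs):
        # The listing names 'encoding' as the one required attribute.
        if 'encoding' not in kwargs:
            raise ValueError('missing required attribute: encoding')
        self.file = file
        self.data = dict(kwargs)

    def httpheaders(self):
        # Assumption: the encoding is advertised as the charset parameter.
        yield ('Content-Type', self.mimetype + '; charset=' + self.data['encoding'])


fmt = SketchTextFormat(None, encoding='utf-8', language='nl')
headers = dict(fmt.httpheaders())

try:
    SketchTextFormat(None)
    missing_rejected = False
except ValueError:
    missing_rejected = True
```

Because `httpheaders()` is a generator of pairs, the result can be passed straight to `dict()` or iterated when building a response.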
class clam.common.formats.PngImageFormat(file, **kwargs)
    attributes = {}
    mimetype = 'image/png'
    name = 'PNG Image'

class clam.common.formats.TICCLShadowOutputXML(file, **kwargs)
    mimetype = 'text/xml'
    name = 'Ticcl Shadow Output'
    schema = ''

class clam.common.formats.TICCLVariantOutputXML(file, **kwargs)
    mimetype = 'text/xml'
    name = 'Ticcl Variant Output'
    schema = ''

clam.common.formats.TadpoleFormat
    alias of clam.common.formats.FrogTSVFormat

class clam.common.formats.TiffImageFormat(file, **kwargs)
    attributes = {}
    mimetype = 'image/tiff'
    name = 'Tiff Image'

clam.common.formats.UndefinedXMLFormat
    alias of clam.common.formats.XMLFormat

class clam.common.formats.WaveAudioFormat(file, **kwargs)
    attributes = {}
    mimetype = 'audio/vnd.wave'
    name = 'Wave Audio File'

class clam.common.formats.XMLFormat(file, **kwargs)
    mimetype = 'text/xml'
    name = 'XML Format (generic, not further specified)'
    schema = ''