CLAM Formats
- class clam.common.converters.CharEncodingConverter(id, **kwargs)
- acceptforinput = [<class 'clam.common.formats.PlainTextFormat'>]
- acceptforoutput = [<class 'clam.common.formats.PlainTextFormat'>]
- convertforinput(filepath, metadata=None)
Convert from target format into one of the source formats. Relevant if converters are used in InputTemplates. ‘metadata’ is already the metadata for the to-be-generated file.
- convertforoutput(outputfile)
Convert from one of the source formats into target format. Relevant if converters are used in OutputTemplates. ‘outputfile’ is a CLAMOutputFile instance.
- label = 'CharEncodingConverter'
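To illustrate the kind of work a character-encoding converter does, here is a minimal standalone sketch that re-encodes a plain-text file in place. It does not use or reproduce CLAM's implementation: the function name `reencode_in_place` and its parameters are hypothetical, and the choice to overwrite the file via a temporary file is an assumption made for this example.

```python
import os
import tempfile


def reencode_in_place(filepath, source_encoding, target_encoding="utf-8"):
    """Rewrite `filepath` so its text is stored in `target_encoding`.

    Hypothetical helper for illustration only; not part of CLAM.
    """
    # Decode the whole file using the encoding it is currently stored in.
    # newline="" keeps line endings exactly as they are.
    with open(filepath, "r", encoding=source_encoding, newline="") as src:
        text = src.read()

    # Write the re-encoded text to a temporary file in the same directory,
    # then swap it into place so a failed write never truncates the original.
    directory = os.path.dirname(os.path.abspath(filepath))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=target_encoding, newline="") as dst:
            dst.write(text)
        os.replace(tmp_path, filepath)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `reencode_in_place("input.txt", "latin-1")` would leave `input.txt` holding the same text encoded as UTF-8.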
- class clam.common.converters.MSWordConverter(id, **kwargs)
- acceptforinput = [<class 'clam.common.formats.PlainTextFormat'>]
- convertforinput(filepath, metadata=None)
Convert from target format into one of the source formats. Relevant if converters are used in InputTemplates. ‘metadata’ is already the metadata for the to-be-generated file. ‘filepath’ is both the source and the target file: the source file will be erased and overwritten with the conversion result!
- converttool = 'catdoc'
- class clam.common.converters.PDFtoHTMLConverter(id, **kwargs)
- acceptforinput = [<class 'clam.common.formats.HTMLFormat'>]
- convertforinput(filepath, metadata=None)
Convert from target format into one of the source formats. Relevant if converters are used in InputTemplates. ‘metadata’ is already the metadata for the to-be-generated file. ‘filepath’ is both the source and the target file: the source file will be erased and overwritten with the conversion result!
- converttool = 'pdftohtml'
- class clam.common.converters.PDFtoTextConverter(id, **kwargs)
- acceptforinput = [<class 'clam.common.formats.PlainTextFormat'>]
- convertforinput(filepath, metadata=None)
Convert from target format into one of the source formats. Relevant if converters are used in InputTemplates. ‘metadata’ is already the metadata for the to-be-generated file. ‘filepath’ is both the source and the target file: the source file will be erased and overwritten with the conversion result!
- converttool = 'pdftotext'
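The tool-based converters above overwrite ‘filepath’ with the conversion result. The following sketch shows one way such an in-place conversion could be done with an external command that writes its result to standard output. It is an assumption-laden illustration, not CLAM's code: `convert_in_place` and `tool_argv` are hypothetical names, and how CLAM actually invokes its `converttool` is not stated in this reference.

```python
import os
import subprocess
import tempfile


def convert_in_place(filepath, tool_argv):
    """Run `tool_argv + [filepath]` and replace `filepath` with its stdout.

    Hypothetical helper for illustration only; not part of CLAM.
    `tool_argv` is the command (and any options) of a tool that prints
    the converted document to standard output.
    """
    directory = os.path.dirname(os.path.abspath(filepath))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed conversion never overwrites the source file.
            subprocess.run(list(tool_argv) + [filepath], stdout=out, check=True)
        # Only now is the original erased and replaced by the result.
        os.replace(tmp_path, filepath)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With a tool that prints converted text to standard output, a call might look like `convert_in_place("upload.doc", ["catdoc"])`, assuming that tool is installed. Because the original is destroyed on success, callers that still need the source file should copy it first.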