dedoc.converters
- class dedoc.converters.AbstractConverter(*, config: dict)[source]
This class provides the common methods for all converters: can_convert() and do_convert().
- __init__(*, config: dict) None[source]
- Parameters:
config – configuration of the converter, e.g. logger for logging
- abstract can_convert(extension: str, mime: str, parameters: dict | None = None) bool[source]
Check whether this converter can convert a file with the given extension.
- Parameters:
extension – file extension, for example .doc or .pdf
mime – MIME type of file
parameters – any additional parameters for given document
- Returns:
True if this converter can convert the file, False otherwise
- abstract do_convert(tmp_dir: str, filename: str, extension: str) str[source]
Convert the given file to another format if possible. This method must only be called on appropriate files: ensure that can_convert() returns True for the given file. If the file format is unsupported, a ConversionException is thrown.
- Parameters:
tmp_dir – directory where the original file is located and where result will be saved
filename – name of the original file without extension
extension – extension of the original file
- Returns:
name of the converted file
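The contract above (check can_convert() first, then call do_convert() with the file name split into base name and extension) can be illustrated with a small subclass. This is a minimal sketch, not dedoc's own code: `ConverterBase` is a local stand-in that mirrors the documented signatures so the example runs without dedoc installed, and `UpperTxtConverter` with its `.low` extension is hypothetical.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Optional, Tuple


class ConverterBase(ABC):
    """Local stand-in mirroring the documented AbstractConverter interface."""

    def __init__(self, *, config: dict) -> None:
        self.config = config

    @abstractmethod
    def can_convert(self, extension: str, mime: str, parameters: Optional[dict] = None) -> bool:
        ...

    @abstractmethod
    def do_convert(self, tmp_dir: str, filename: str, extension: str) -> str:
        ...


class UpperTxtConverter(ConverterBase):
    """Hypothetical converter: turns '.low' text files into upper-cased '.txt' files."""

    def can_convert(self, extension: str, mime: str, parameters: Optional[dict] = None) -> bool:
        return extension.lower() == ".low"

    def do_convert(self, tmp_dir: str, filename: str, extension: str) -> str:
        # filename comes without extension; the result is saved next to the original.
        converted_name = filename + ".txt"
        with open(os.path.join(tmp_dir, filename + extension), encoding="utf-8") as f:
            text = f.read()
        with open(os.path.join(tmp_dir, converted_name), "w", encoding="utf-8") as f:
            f.write(text.upper())
        return converted_name


def demo_convert() -> Tuple[str, str]:
    converter = UpperTxtConverter(config={})
    with tempfile.TemporaryDirectory() as tmp_dir:
        with open(os.path.join(tmp_dir, "note.low"), "w", encoding="utf-8") as f:
            f.write("hello")
        # Always check can_convert() before calling do_convert().
        if not converter.can_convert(".low", "text/plain"):
            raise ValueError("unsupported file")
        name = converter.do_convert(tmp_dir, "note", ".low")
        with open(os.path.join(tmp_dir, name), encoding="utf-8") as f:
            return name, f.read()
```

With the real library, the subclass would presumably inherit from dedoc.converters.AbstractConverter instead of the stand-in.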
- class dedoc.converters.FileConverterComposition(converters: List[AbstractConverter])[source]
This class converts a document into one of a predefined list of formats using the available converters. The list of converters is set via the class constructor. The first suitable converter is used (the one whose can_convert() method returns True), so the order of converters is important.
- __init__(converters: List[AbstractConverter]) None[source]
- Parameters:
converters – the list of converters that implement can_convert() and do_convert(); they are used to convert files into the specified formats
- do_converting(tmp_dir: str, filename: str, parameters: dict | None = None) str[source]
Convert file if there is the converter that can do it. If there isn’t any converter that is able to convert the file, it isn’t changed.
- Parameters:
tmp_dir – the directory where the file is located and where the converted file will be saved
filename – the name of the file to convert
parameters – conversion parameters
- Returns:
name of the converted file if a conversion was performed, otherwise the name of the original file
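The first-match behaviour described for this class can be sketched as a simple loop. The code below is an illustration under assumptions, not the library's implementation: `StubConverter` and `first_match_convert` are hypothetical names, and guessing the MIME type with the standard `mimetypes` module is an assumption about how the extension and MIME type might be derived from the file name.

```python
import mimetypes
import os
from typing import List, Optional, Set


class StubConverter:
    """Hypothetical converter that only reports the name it would produce."""

    def __init__(self, extensions: Set[str], target: str) -> None:
        self.extensions = extensions
        self.target = target

    def can_convert(self, extension: str, mime: str, parameters: Optional[dict] = None) -> bool:
        return extension.lower() in self.extensions

    def do_convert(self, tmp_dir: str, filename: str, extension: str) -> str:
        # A real converter would write the converted file into tmp_dir here.
        return filename + self.target


def first_match_convert(converters: List[StubConverter], tmp_dir: str, filename: str,
                        parameters: Optional[dict] = None) -> str:
    """Return the converted file name, or the original name if no converter matches."""
    name, extension = os.path.splitext(filename)
    mime = mimetypes.guess_type(filename)[0] or ""
    for converter in converters:
        if converter.can_convert(extension, mime, parameters):
            return converter.do_convert(tmp_dir, name, extension)
    return filename


generic = StubConverter({".odt", ".doc"}, ".pdf")
specific = StubConverter({".doc"}, ".docx")

# Both converters accept ".doc", so the list order decides which one runs.
first = first_match_convert([specific, generic], "/tmp", "report.doc")
swapped = first_match_convert([generic, specific], "/tmp", "report.doc")
# No converter accepts ".md", so the file name is returned unchanged.
untouched = first_match_convert([specific, generic], "/tmp", "notes.md")
```

Because two converters in the sketch accept the same extension, swapping their order changes the result, which is why more specific converters would normally be listed first.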
- class dedoc.converters.BinaryConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts image-like documents with mime=application/octet-stream into PNG. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.DocxConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts docx-like documents into DOCX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.ExcelConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts xlsx-like documents into XLSX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.PptxConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts pptx-like documents into PPTX using the soffice application. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.PDFConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts pdf-like documents into PDF using the ddjvu application. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.PNGConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts image-like documents into PNG. See the AbstractConverter documentation for information about the methods’ parameters.
- class dedoc.converters.TxtConverter(*, config: dict)[source]
Bases: AbstractConverter
Converts txt-like documents into TXT by simple renaming. See the AbstractConverter documentation for information about the methods’ parameters.
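Conversion "by simple renaming" can be pictured with the do_convert() signature documented for AbstractConverter. The following is only a sketch of the idea: `rename_to_txt` is a hypothetical helper, the `.log` extension is chosen for illustration, and whether dedoc moves or copies the original file is not stated here.

```python
import os
import tempfile
from typing import List, Tuple


def rename_to_txt(tmp_dir: str, filename: str, extension: str) -> str:
    """Rename '<filename><extension>' to '<filename>.txt' inside tmp_dir."""
    converted_name = filename + ".txt"
    os.rename(os.path.join(tmp_dir, filename + extension),
              os.path.join(tmp_dir, converted_name))
    return converted_name


def demo_rename() -> Tuple[str, List[str]]:
    with tempfile.TemporaryDirectory() as tmp_dir:
        with open(os.path.join(tmp_dir, "server.log"), "w", encoding="utf-8") as f:
            f.write("plain text content")
        name = rename_to_txt(tmp_dir, "server", ".log")
        # Only the file name changes; the content is left as it was.
        return name, sorted(os.listdir(tmp_dir))
```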