Contributor

@dzhuang dzhuang commented May 5, 2016

Requirements:

  • latex
  • dvipng
  • dvisvgm

Texlive-full probably contains all of them.

USAGE:

{% call latex() %}
The latex code
{% endcall %}

Optional args:

  • output_dir: full path or RELATE-style path where the generated file is placed; defaults to a directory named "latex_image" under MEDIA_ROOT.
  • tex_filename: the base filename of the latex file, and of the image as well; if not set, the md5 of the full latex code is used.
  • image_format: the output format of the image. Only png and svg are available, and svg is the default. Even if png is set, code containing tikz/pgf will still use svg.
  • tex_preamble: the preamble, other than what is provided in settings or the default value in course.latex_utils. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo (same for tex_preamble_extra).
  • tex_preamble_extra: more packages or settings appended to the default preamble.
  • overwrite: regenerate the image if it already exists; default: False (recommended).
  • html_class_extra: extra HTML class for the <img> tag, besides img-responsive.
  • alt: alt attribute for the <img> tag; the default value is the code in the call block.
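
The md5-based default for tex_filename can be illustrated with a minimal sketch. It assumes the hash is taken over the UTF-8 encoded source; `default_tex_filename` is a hypothetical helper name, not necessarily what the PR uses.

```python
import hashlib


def default_tex_filename(tex_source: str) -> str:
    """Return a file base name derived from the md5 of the full LaTeX source."""
    # Hashing the whole source means any change to the code yields a new name,
    # so an existing image is only reused for byte-identical input.
    return hashlib.md5(tex_source.encode("utf-8")).hexdigest()


name = default_tex_filename(r"\documentclass{article}\begin{document}x\end{document}")
print(name + ".svg")
```

Deriving the name from the content would also make the overwrite=False default cheap to honor: if a file with that name exists, the same source has presumably been rendered before.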

A brief example is provided in inducer/relate-sample#4

Contributor Author

dzhuang commented May 6, 2016

Updated. Looking forward to your advice.

Owner

inducer commented May 6, 2016

Some comments from a first read:

  • I'm not sure I like that the preamble is configurable per-site. That means course content would become site-specific, because different sites may use different preambles.
  • There's a security implication here. We're running a bunch of commands with potentially semi-trusted input. At least latex should be run with -no-shell-escape. What about \verbatiminput{/etc/passwd}?
  • tex2...I - PEP8
  • Why make this DVI-based? That seems backwards. Why not PDF?
  • Raising an exception because something is wrong with user-written content should not happen outside of validation. (Otherwise, say, the course page may not render any more, and it becomes hard to even update the content.)
  • I'm not sure I like the idea that this leaves permanent traces on the file system. Why not run in a temporary directory and blow that away when done? Also, as is, you assume you know about every file produced by TeX, which is generally not true. If you'd like to cache, use Django's cache mechanism.
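
Two of the points above (disabling shell escape and compiling in a throwaway directory) could look roughly like the following sketch. The helper names, the file name snippet.tex and the timeout are assumptions for illustration, and a PDF-producing compiler such as pdflatex is assumed. Note that -no-shell-escape only blocks external commands; it does not by itself address file reads like the \verbatiminput example.

```python
import subprocess
import tempfile
from pathlib import Path


def build_latex_command(compiler: str, tex_name: str) -> list:
    """Build an argument list that runs the compiler with shell escape disabled."""
    return [
        compiler,
        "-no-shell-escape",          # refuse to run external commands from TeX
        "-interaction=nonstopmode",  # never wait for keyboard input
        "-halt-on-error",            # stop at the first error
        tex_name,
    ]


def compile_in_tempdir(tex_source: str, compiler: str = "pdflatex",
                       timeout: int = 60) -> bytes:
    """Compile in a temporary directory and return the resulting PDF bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        (Path(workdir) / "snippet.tex").write_text(tex_source, encoding="utf-8")
        subprocess.run(
            build_latex_command(compiler, "snippet.tex"),
            cwd=workdir, check=True, timeout=timeout,
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        )
        # Every file TeX wrote (aux, log, ...) disappears with the directory,
        # so there is no need to know the full list of by-products.
        return (Path(workdir) / "snippet.pdf").read_bytes()
```

Because the directory is created with a unique name per call, two concurrent renders of the same source do not share intermediate files.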

Contributor Author

dzhuang commented May 7, 2016

About the site-specific preamble: I had actually noticed that problem while writing the code. As you can see, I tried to make the preamble page-specific in the code. I need your advice on where (and probably how) to configure the preamble elegantly in git so that it can be course-specific.

About being DVI-based: personally I prefer pdflatex over latex, but I am not sure whether there are tools for converting PDF to images that are more convenient than dvipng and dvisvgm, which are included in Texlive-full and so need no extra thought about cross-platform installation. Maybe inkscape or imagemagick?

Thank you for all the valuable advice. I'll try to improve the snippet.

Contributor Author

dzhuang commented May 12, 2016

In fact, my latex compilation setup is rather large: it includes many packages, custom commands, things like makeatletter, tikz settings, and many external image files. My idea is as follows:

  1. Configure all the needed paths, like preamble.tex, package.tex, config.tex and the figure directory, in the course git, using something like attribute.xml?
  2. Make the git root of the course the default working directory (WD), and copy those files from the repo to the WD if they don't exist or don't match.
  3. Use a temp directory in the WD when compiling, and blow it away when done, as you advised.

In this way, I can merge my local latex project into the course without keeping two different versions.

P.S. I've been trying these days to find a convenient cross-platform pdf2png/pdf2svg converter, but failed. I found this post, which also complains about this issue and sticks with dvipng and dvisvgm as the solution. Being cross-platform is important for me, because I develop on Windows and deploy on Linux.

Edit: I just found pdf2svg. I will test it.

Result: pdf2svg generates svg files much larger than dvisvgm's, about 3-4 times bigger. Using the example here, pdf2svg generates an svg of more than 9M, while dvisvgm's is 1.8M. When the svgs are converted to png (with svgexport), the former results in a 3.4M file, while the latter is less than 900k.

I think that, for now, the best solution is to use dvi as the base, and to convert the svg to png if the size of the svg is not acceptable.

Contributor Author

dzhuang commented May 15, 2016

Now I think it's better for the latex code in a page to be independent of external resources, because if those external settings change, recompiling previously generated images might fail. The best practice is to provide the full document in the page (or at least make sure all of its elements can be assembled in that page).

Contributor Author

dzhuang commented May 16, 2016

I now accept that we should use pdf as the base for generating the image, at least as an option. From my experience these days, I found it painful to configure the fonts and character sets on the linux server, just for the purpose of displaying Chinese characters.

Svg is a beautiful format (although a drawback is that it requires a larger cache), but generating svg from pdf is hard, at least for now. However, using dvi as the base will bring problems in the future. For example, if I want to migrate data from one server to another, I'll have to do the complex configuration again, and that will also be a high barrier for new users.

My current solution is:

  1. Add the pdflatex and xelatex compilers, and leave latex (dvi) as an option, as xelatex is easier to configure in terms of fonts and character sets.
  2. Because compiling the same code with a different compiler might fail, the user must specify which compiler to use, and there should be no default compiler at either the site level or the course level. That also leaves room for more compilers in the future.
  3. Use imagemagick to convert PDFs to png. The generated images are not scalable and don't look as good, but it's really a life saver. Moreover, the option to generate svg from dvi is preserved, so svgs can still be generated.
  4. Another thing: don't put the source code in alt, because MathJax treats \begin{tikzpicture} \end{tikzpicture} as math expressions and makes a mess of them instead of displaying the valid images.
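
As a rough illustration of item 3, the PDF-to-png step could be driven by an argument list like the one below. This is a sketch under assumptions: the executable is taken to be ImageMagick's `convert` (newer installations may ship `magick` instead), the density value is arbitrary, and `build_convert_command` is a hypothetical helper.

```python
def build_convert_command(pdf_path: str, png_path: str, density: int = 96) -> list:
    """Argument list for ImageMagick's `convert` to rasterize a PDF into a png."""
    return [
        "convert",
        "-density", str(density),  # rasterization resolution in DPI, set before the input
        pdf_path,
        png_path,
    ]


print(build_convert_command("figure.pdf", "figure.png", density=150))
```

A higher density gives a sharper but larger png, which is the trade-off behind the "not scalable" remark.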

@inducer I need your advice on this, because I urgently need this feature. I already use it on my instance, and the number of pages is growing. I think I need to rewrite some of the code, but I should do so before the number of pages gets out of control. For now, I want to settle the args/kwargs (required/optional) that the caller may use. My current idea is:

Required args:

  • compiler: string, the command line used to compile the tex file, currently available: xelatex, pdflatex and latex.

Optional:

  • output_dir: string, the name of the subfolder under MEDIA_ROOT/course_identifier/flow_name where the generated image will be placed; it is also the working dir of the compilation/conversion. Default: "".
  • tex_filename: string, the base filename of the latex file, and of the image as well; if not set, the md5 of the full latex code is used.
  • image_format: string, the output format of the image. Only png and svg are available, and png is the default. If compiler="latex" and the code contains tikz/pgf, image_format is forced to svg.
  • tex_preamble: string, allows the user to split the preamble from the full document. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo. That parameter can be saved in a jinja file like "latex.jinja" in the course git, but that is not recommended, as changing the preamble will force a previously valid image to be recompiled, which might fail (same for tex_preamble_extra). The best practice is to keep the full latex document, which might be changed in the future, in the page markup.
  • tex_preamble_extra: string, more packages or settings appended to tex_preamble.
  • overwrite: boolean; if True, regenerate the image if it exists. Default: False (recommended).
  • html_class_extra: string, extra HTML class for the <img> tag, besides img-responsive.
  • alt: string, brief description of the image; default: tex_filename.
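
The interplay between the required compiler argument and image_format can be sketched as follows. The function name and the way tikz/pgf is detected (a plain substring search) are assumptions made for illustration; the PR may detect this differently.

```python
ALLOWED_COMPILERS = ("xelatex", "pdflatex", "latex")
ALLOWED_FORMATS = ("png", "svg")


def resolve_image_format(compiler: str, tex_source: str,
                         image_format: str = "png") -> str:
    """Validate the arguments and return the image format that will be produced."""
    # The compiler has no default: the caller must name one explicitly.
    if compiler not in ALLOWED_COMPILERS:
        raise ValueError("compiler must be one of: " + ", ".join(ALLOWED_COMPILERS))
    if image_format not in ALLOWED_FORMATS:
        raise ValueError("image_format must be one of: " + ", ".join(ALLOWED_FORMATS))
    # Assumed heuristic: any mention of tikz or pgf counts as "contains tikz/pgf".
    uses_tikz = "tikz" in tex_source or "pgf" in tex_source
    if compiler == "latex" and uses_tikz:
        return "svg"
    return image_format


print(resolve_image_format("latex", r"\usepackage{tikz}", "png"))
```

Rejecting unknown compilers up front keeps the "no default compiler" rule enforceable at validation time rather than at render time.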

Currently, I have only used the image_format kwarg, and I want to add the compiler as the first arg. The names of all the other kwargs can still be changed. If no more args are needed, I can set out to alter my code now. Looking forward to your reply, thanks.

Contributor Author

dzhuang commented May 31, 2016

UPDATE:

  1. Allow pdflatex/xelatex/latex compilation
  2. Use latexmk to determine the number of times needed for tex compilation.
  3. Use Imagemagick to convert PDFs to png.
  4. Use django check framework to check if the path/version of compilers and converters are correctly configured.
  5. Cache the result and error logs.
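
Items 2 and 4 might be approached along these lines. This is a plain-Python sketch rather than the Django check itself: the flag mapping assumes a reasonably recent latexmk, the tool names are assumed to be on PATH, and both helper names are hypothetical. In Django, a function like `missing_tools` would presumably be wrapped in a registered system check.

```python
import shutil

# Assumed mapping from the compiler argument to the latexmk mode flag.
LATEXMK_FLAGS = {
    "latex": "-dvi",
    "pdflatex": "-pdf",
    "xelatex": "-xelatex",
}


def build_latexmk_command(compiler: str, tex_name: str) -> list:
    """latexmk reruns the engine as often as needed to resolve references."""
    return ["latexmk", LATEXMK_FLAGS[compiler], tex_name]


def missing_tools(tools=("latexmk", "convert")) -> list:
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print(build_latexmk_command("xelatex", "snippet.tex"))
print(missing_tools())
```

Running the availability check at startup surfaces a misconfigured server before any course page tries to render.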

Requirements:

  • Texlive-full (the TUG version of Texlive is preferred since it contains all the latest packages; note that it requires the RELATE_LATEX_BIN_DIR setting on a linux server.)
  • ImageMagick (on windows, this requires the RELATE_IMAGEMAGIC_BIN_DIR setting.)

USAGE:

{% call latex(compiler="pdflatex", image_format="png") %}
The latex code
{% endcall %}

Required args:

  • compiler: string, the command line used to compile the tex file, currently available: xelatex, pdflatex and latex.
  • image_format: string, the output format of the image; only png and svg are available for now. If compiler="latex" and the code contains tikz/pgf, image_format is forced to svg.

Optional:

  • tex_filename: string, the base filename of the latex file, and of the image as well; if not set, the md5 of the full latex code is used.
  • tex_preamble: string, allows the user to split the preamble from the full document. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo. That parameter can be saved as a macro in the course git, but that is not recommended, as changing the preamble will force a previously valid image to be recompiled, which might fail (same for tex_preamble_extra). The best practice is to keep the full latex document, which might be changed in the future, in the page markup.
  • tex_preamble_extra: string, more packages or settings appended to tex_preamble.
  • force_regenerate: boolean; if True, regenerate the image if it exists. Default: False (recommended).
  • html_class_extra: string, extra HTML class for the <img> tag, besides img-responsive.
  • alt: string, brief description of the image; default: the document part of the tex source.
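
One way to read "the document part of the tex source" for the alt default is the text between \begin{document} and \end{document}. The sketch below takes that reading as an assumption; `default_alt_text` is a hypothetical helper.

```python
import re

# Non-greedy match of everything inside the document environment.
_DOCUMENT_BODY = re.compile(r"\\begin\{document\}(.*?)\\end\{document\}", re.DOTALL)


def default_alt_text(tex_source: str) -> str:
    """Return the body of the document environment, or the whole source if absent."""
    match = _DOCUMENT_BODY.search(tex_source)
    body = match.group(1) if match else tex_source
    return body.strip()


source = "\\documentclass{article}\n\\begin{document}\nHello\n\\end{document}\n"
print(default_alt_text(source))
```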

Example is also updated in inducer/relate-sample#4

dzhuang added a commit to dzhuang/relate-sample that referenced this pull request May 31, 2016
Contributor Author

dzhuang commented Jun 2, 2016

UPDATE:

{{, }}, {%, %}, {# and #} are jinja template delimiters; if the latex code in the call block contains any of those strings, jinja will fail to render. The workaround is to manually insert whitespace (spaces or tabs) between the two characters (e.g., {{ --> { {) for each occurrence of those strings in the latex code; the added whitespace is then removed before the latex source is compiled.
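
The removal step of this workaround could be sketched with two regular expressions. This is an illustration under assumptions, not the PR's code: `rejoin_split_delimiters` is a hypothetical name, and the sketch also collapses whitespace that an author placed between such characters on purpose.

```python
import re

# Whitespace between "{" and one of "{", "%", "#" (opening delimiters).
_OPENING = re.compile(r"(?<=\{)[ \t]+(?=[{%#])")
# Whitespace between one of "}", "%", "#" and "}" (closing delimiters).
_CLOSING = re.compile(r"(?<=[}%#])[ \t]+(?=\})")


def rejoin_split_delimiters(tex_source: str) -> str:
    """Remove spaces/tabs that were inserted to hide Jinja delimiters."""
    tex_source = _OPENING.sub("", tex_source)
    return _CLOSING.sub("", tex_source)


print(rejoin_split_delimiters(r"\frac{ {a} }{b}"))
```

Lookarounds are used so that only the whitespace is consumed; a run such as `{ { {` is therefore rejoined completely in one pass.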

relate_course_tag = 'relate_course_tag'

@register(Tags.relate_course_tag)
@register(Tags.relate_course_tag, deploy=True)
Owner

Does this need to be duplicated?

Owner

I'm not saying it's wrong, in fact I don't know--the duplication just looks fishy.

Contributor Author

You are right. I'll correct it.

Owner

inducer commented Jun 30, 2016

  • This needs some documentation (along the lines of what you included above) to be in the .rst file.
  • How does this manage temporary files? I couldn't find that in the code, but it's very important to do that in a secure and race-free manner.

Contributor Author

dzhuang commented Jul 1, 2016

For temporary files, I think they are created here, and are removed using _remove_working_dir defined here.

Contributor Author

dzhuang commented Aug 29, 2016

It has been two months since the last commit; how time flies. Sorry, I'll return to this asap.

Contributor Author

dzhuang commented Sep 2, 2016

Update:

  • Add documentation
  • Enable lualatex

@dzhuang dzhuang force-pushed the latex2img branch 2 times, most recently from edd7547 to 48839a8 Compare September 2, 2016 01:59

try:
    log = get_abstract_latex_log(log)
    _file_write(self.errlog_saving_path, log)
Contributor Author

Risk of race condition.

))

try:
    shutil.copyfile(image_path, self.image_saving_path)
Contributor Author

Risk of race condition.
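
One common way to narrow this kind of race, shown here as a sketch rather than as what the PR ended up doing, is to copy into a temporary file in the destination directory and then rename it into place. `atomic_copy` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def atomic_copy(src: str, dst: str) -> None:
    """Copy src to dst so that readers never observe a half-written dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Create the temporary file next to dst so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp_path)
        # os.replace overwrites an existing dst; the rename is atomic on POSIX.
        os.replace(tmp_path, dst)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If two requests render the same image at once, each writes its own temporary file and the last rename wins, so the destination always holds a complete image.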

Contributor Author

@dzhuang dzhuang left a comment

Use MongoDB as storage instead of the filesystem to store the converted data URIs, so as to avoid race conditions. Passes the mypy test.
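
For readers unfamiliar with data URIs, the conversion mentioned here amounts to base64-encoding the image bytes behind a MIME-type prefix. The sketch below shows only that encoding step; `to_data_uri` is a hypothetical helper, and the MongoDB storage itself is not shown.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


print(to_data_uri(b"hi"))
```

For svg output the MIME type would be image/svg+xml.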


def _file_read(filename):
    '''Read the content of a file and close it properly.'''
    f = file(filename, 'rb')
Contributor Author

not py3 compatible.

@dzhuang dzhuang force-pushed the latex2img branch 3 times, most recently from ac38dbf to cb2056e Compare April 10, 2017 08:37
@dzhuang dzhuang force-pushed the master branch 2 times, most recently from d501a29 to 3c0900e Compare February 7, 2018 15:39
Base automatically changed from master to main March 8, 2021 02:15