add a latex macro to convert Tex code to <img> tag using dataURI. #207

dzhuang · 2016-05-05T14:57:04Z

Requirements:

latex
dvipng
dvisvgm

texlive-full should contain all of them.

USAGE:

{% call latex() %}
The latex code
{% endcall %}

Optional args:

output_dir: full path or RELATE-style path for the generated file; defaults to a directory named "latex_image" under MEDIA_ROOT.
tex_filename: the base filename of the LaTeX source and of the image; if not set, the md5 of the full LaTeX code is used.
image_format: the output format of the image. Only png and svg are available, and svg is the default. Even if png is set, code containing tikz/pgf will still be rendered as svg.
tex_preamble: a preamble to use instead of the one provided in settings or the default value in course.latex_utils. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo (same for tex_preamble_extra).
tex_preamble_extra: additional packages or settings appended to the default preamble.
overwrite: regenerate the image if it already exists; default: False (recommended).
html_class_extra: extra HTML classes for the <img> tag, in addition to img-responsive.
alt: alt attribute for the <img> tag; defaults to the code in the call block.

A brief example is provided in inducer/relate-sample#4
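
To make the md5-based default filename and the data URI output concrete, here is a minimal sketch in Python. The function names (`default_basename`, `to_data_uri`, `to_img_tag`) are hypothetical and not taken from the PR; only the md5 default, the `img-responsive` class, and the `alt` / `html_class_extra` arguments come from the description above.

```python
import base64
import hashlib
from html import escape

# MIME types for the two image formats named in the description.
MIME_TYPES = {"png": "image/png", "svg": "image/svg+xml"}


def default_basename(tex_source):
    """Default base filename: md5 hex digest of the full LaTeX code."""
    return hashlib.md5(tex_source.encode("utf-8")).hexdigest()


def to_data_uri(image_bytes, image_format="svg"):
    """Encode raw image bytes as a base64 data URI."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return "data:{};base64,{}".format(MIME_TYPES[image_format], payload)


def to_img_tag(image_bytes, image_format="svg", alt="", html_class_extra=""):
    """Build an <img> tag whose src is the data URI (hypothetical helper)."""
    classes = " ".join(c for c in ("img-responsive", html_class_extra) if c)
    return '<img src="{}" class="{}" alt="{}">'.format(
        to_data_uri(image_bytes, image_format),
        escape(classes, quote=True),
        escape(alt, quote=True),
    )
```

Escaping `alt` matters here because its default value is user-written LaTeX, which may contain quotes and angle brackets.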

dzhuang · 2016-05-06T09:54:11Z

Updated. Looking forward to your advice.

inducer · 2016-05-06T23:45:53Z

Some comments from a first read:

I'm not sure I like that the preamble is configurable per-site. That means course content would become site-specific, because different sites may use different preambles.
There's a security implication here. We're running a bunch of commands with potentially semi-trusted input. At least latex should be run with -no-shell-escape. What about \verbatiminput{/etc/passwd}?
tex2...I - PEP8
Why make this DVI-based? That seems backwards. Why not PDF?
Raising an exception because something is wrong with user-written content should not happen outside of validation. (Otherwise, say, the course page may not render any more, and it becomes hard to even update the content.)
I'm not sure I like the idea that this leaves permanent traces on the file system. Why not run in a temporary directory and blow that away when done. Also, as is you assume you know about every file produced by TeX, which is generally not true. If you'd like to cache, use django's cache mechanism.
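
The security and temporary-directory points above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the helper names are hypothetical, and setting `openin_any`/`openout_any` to `p` through the environment assumes a TeX Live (kpathsea) installation that honors those variables. Errors in user-written TeX are returned as a log instead of being raised, in line with the comment about exceptions.

```python
import os
import subprocess
import tempfile

ALLOWED_COMPILERS = ("latex", "pdflatex", "xelatex")


def build_command(compiler, tex_name):
    """Command line with shell escape disabled and no interactive prompts."""
    if compiler not in ALLOWED_COMPILERS:
        raise ValueError("unsupported compiler: %r" % (compiler,))
    return [compiler, "-no-shell-escape", "-interaction=nonstopmode",
            "-halt-on-error", tex_name]


def compile_in_tempdir(tex_source, compiler="latex", timeout=30,
                       run=subprocess.run):
    """Compile in a throwaway directory; return (output_bytes, log).

    output_bytes is None when compilation fails. `run` is injectable so the
    function can be exercised without a TeX installation.
    """
    out_ext = ".dvi" if compiler == "latex" else ".pdf"
    # Assumption: kpathsea reads these from the environment; "p" (paranoid)
    # is meant to block absolute paths such as /etc/passwd.
    env = dict(os.environ, openin_any="p", openout_any="p")
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "doc.tex"), "w", encoding="utf-8") as f:
            f.write(tex_source)
        result = run(build_command(compiler, "doc.tex"), cwd=workdir, env=env,
                     stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                     timeout=timeout)
        out_path = os.path.join(workdir, "doc" + out_ext)
        if result.returncode != 0 or not os.path.exists(out_path):
            return None, (result.stdout or b"").decode("utf-8", "replace")
        with open(out_path, "rb") as f:
            return f.read(), ""
    # Every file TeX produced is removed together with workdir.
```

Because the whole directory is deleted on exit, the code does not need to know which auxiliary files TeX wrote.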

dzhuang · 2016-05-07T12:12:51Z

About the site-specific preamble: I had actually noticed that problem while writing the code. As you can see, I tried to make the preamble page-specific in the code, and I need your advice on where (and probably how) to configure the preamble elegantly in git so that it can be course-specific.

About being DVI-based: personally I prefer pdflatex over latex, but I am not sure whether there are tools for converting PDF to images that are more convenient than dvipng and dvisvgm, which are included in texlive-full and so need no extra cross-platform installation work. Maybe inkscape or imagemagick?

Thank you for all the valuable advice. I'll try to improve the snippet.

dzhuang · 2016-05-12T10:47:57Z

In fact, my LaTeX compilation setup is fairly large: it includes many packages, custom commands, things like makeatletter, tikz settings, and many external image files. My idea is as follows:

Configure all the needed paths, such as preamble.tex, package.tex, config.tex and the figure directory, in the course git, using something like attribute.xml?
Make the git root of the course the default working directory (WD), and copy those files from the repo to the WD if they don't exist or don't match.
Use a temp directory in the WD when compiling, and blow it away when done, as you advised.

This way, I can merge my local LaTeX project into the course without keeping two different versions.

P.S. I've been trying these days to find a convenient cross-platform pdf2png/pdf2svg converter, but failed. I found that this post also complains about the issue and stays with dvipng and dvisvgm as the solution. Being cross-platform is important to me, because I develop on Windows and deploy on Linux.

Edit: I just found pdf2svg. I will test it.

Result: pdf2svg generates svg files much larger than dvisvgm does, about 3-4 times bigger. Using the example here, pdf2svg generates an svg of more than 9M, while dvisvgm's is 1.8M. When the svgs are converted to png (with svgexport), the former results in a 3.4M file, while the latter is less than 900k.

I think the best solution for now is to use DVI as the base, and convert the svg to png if the size of the svg is not acceptable.

dzhuang · 2016-05-15T05:02:45Z

I now think it's better for the LaTeX code in a page to be independent of external resources, because if those external settings change, previously generated images might fail to regenerate. The best practice is to provide the full document in the page (or at least make sure all its elements can be assembled in that page).

dzhuang · 2016-05-16T09:07:49Z

I now accept that we should use PDF as the base for generating the image, at least as an option. From my experience these days, configuring fonts and character sets on the Linux server is painful, just for the purpose of displaying Chinese characters.

SVG is a beautiful format (although one drawback is that it requires a larger cache), but generating svg from PDF is hard, at least for now. However, using DVI as the base will cause problems in the future. For example, if I want to migrate data from one server to another, I'll have to redo the complex configuration, which will also be a high barrier for new users.

My current solution is:

Add pdflatex and xelatex compilers, and leave latex (DVI) as an option, since xelatex is easier to configure in terms of fonts and character sets.
Because compiling the same code with a different compiler might fail, the user must specify which compiler is used, and there should be no default compiler at either the site level or the course level. That also leaves room for more compilers to be added in the future.
Use imagemagick to convert PDFs to png. The generated images are not scalable and don't look great, but it's really a life saver. Moreover, the option to generate svg from DVI is preserved, leaving open the possibility of generating svgs.
One more thing: don't put the source code in alt, because MathJax treats \begin{tikzpicture} \end{tikzpicture} as math expressions and makes a mess instead of displaying the valid images.

@inducer I need your advice on this, because I urgently need this feature. I already use it on my instance, and the number of pages is growing. I think I need to rewrite some of the code, and I should do so before the number of pages gets out of control. For now, I want to settle the args/kwargs (required/optional) that I may use in the caller. My current idea is:

Required args:

compiler: string, the command used to compile the tex file; currently available: xelatex, pdflatex and latex.

Optional:

output_dir: string, the name of the subfolder under MEDIA_ROOT/course_identifier/flow_name where the generated image will be placed, which is also the working directory for compilation/conversion; default "".
tex_filename: string, the base filename of the LaTeX source and of the image; if not set, the md5 of the full LaTeX code is used.
image_format: string, the output format of the image. Only png and svg are available, and png is the default. If compiler="latex" and the code contains tikz/pgf, image_format is forced to svg.
tex_preamble: string, allows the user to split the preamble from the full document. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo. That parameter can be saved in a jinja file like "latex.jinja" in the course git, but this is not recommended, because changing the preamble forces previously valid images to recompile, which might fail (same for tex_preamble_extra). The best practice is to keep the full LaTeX document, which might change in the future, in the page markup.
tex_preamble_extra: string, additional packages or settings appended to tex_preamble.
overwrite: boolean, if True, regenerate the image if it already exists; default: False (recommended).
html_class_extra: string, extra HTML classes for the <img> tag, in addition to img-responsive.
alt: string, a brief description of the image; default: tex_filename.

Currently, I have only used the image_format kwarg, and I want to add compiler as the first arg. All the other kwarg names can be changed. If no more args are needed, I can start altering my code now. Looking forward to your reply, thanks.

dzhuang · 2016-05-31T15:08:46Z

UPDATE:

Allow pdflatex/xelatex/latex compilation
Use latexmk to determine how many compilation passes are needed.
Use ImageMagick to convert PDFs to png.
Use the Django check framework to verify that the paths/versions of the compilers and converters are correctly configured.
Cache the results and error logs.

Requirements:

texlive-full (the TUG version of TeX Live is preferred since it contains all the latest packages; note that it requires RELATE_LATEX_BIN_DIR to be configured in settings on a Linux server.)
ImageMagick (on Windows, this requires RELATE_IMAGEMAGIC_BIN_DIR to be configured in settings.)
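
The configuration check mentioned in the update list could, at its core, look roughly like this. It is a standard-library sketch only: `find_missing_tools` is a hypothetical name, and the PR's actual check is registered with the Django check framework, which is not reproduced here. `bin_dir` stands in for a setting such as RELATE_LATEX_BIN_DIR.

```python
import shutil


def find_missing_tools(tools, bin_dir=None):
    """Return one error message per executable that cannot be located.

    If bin_dir is given, only that directory is searched; otherwise the
    regular PATH is used.
    """
    errors = []
    for tool in tools:
        if shutil.which(tool, path=bin_dir) is None:
            where = bin_dir or "PATH"
            errors.append("{!r} was not found in {}".format(tool, where))
    return errors


# Example call (the tool list is an assumption based on this thread):
# find_missing_tools(["latexmk", "pdflatex", "xelatex"], bin_dir="/opt/texlive/bin")
```

Returning messages instead of raising lets a caller turn each entry into a check error or warning as appropriate.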

USAGE:

{% call latex(compiler="pdflatex", image_format="png") %}
The latex code
{% endcall %}

Required args:

compiler: string, the command used to compile the tex file; currently available: xelatex, pdflatex and latex.
image_format: string, the output format of the image; only png and svg are available for now. If compiler="latex" and the code contains tikz/pgf, image_format is forced to svg.

Optional:

tex_filename: string, the base filename of the LaTeX source and of the image; if not set, the md5 of the full LaTeX code is used.
tex_preamble: string, allows the user to split the preamble from the full document. In the yml, its value is probably best set with a {% set foo %}{% endset %} block and then passed as tex_preamble=foo. That parameter can be saved as a macro in the course git, but this is not recommended, because changing the preamble forces previously valid images to recompile, which might fail (same for tex_preamble_extra). The best practice is to keep the full LaTeX document, which might change in the future, in the page markup.
tex_preamble_extra: string, additional packages or settings appended to tex_preamble.
force_regenerate: boolean, if True, regenerate the image if it already exists; default: False (recommended).
html_class_extra: string, extra HTML classes for the <img> tag, in addition to img-responsive.
alt: string, a brief description of the image; default: the document part of the tex source.

The example in inducer/relate-sample#4 has also been updated.
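
The rule for the two required arguments can be sketched as below. The function name is hypothetical, and the way tikz/pgf is detected (a plain substring search) is an assumption; the PR may detect it differently.

```python
import re

COMPILERS = ("latex", "pdflatex", "xelatex")
IMAGE_FORMATS = ("png", "svg")

# Assumption: any occurrence of "tikz" or "pgf" counts as tikz/pgf code.
_TIKZ_OR_PGF = re.compile(r"tikz|pgf")


def resolve_image_format(compiler, image_format, tex_source):
    """Validate the required args and apply the forced-svg rule."""
    if compiler not in COMPILERS:
        raise ValueError("unknown compiler: %r" % (compiler,))
    if image_format not in IMAGE_FORMATS:
        raise ValueError("unknown image_format: %r" % (image_format,))
    if compiler == "latex" and _TIKZ_OR_PGF.search(tex_source):
        # DVI-based compilation of tikz/pgf code is always rendered as svg.
        return "svg"
    return image_format
```

Validation of this kind belongs in course validation rather than page rendering, so that a bad argument does not break an already published page.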

dzhuang · 2016-06-02T08:33:33Z

UPDATE:

{{, }}, {%, %}, {# and #} are delimiters in jinja templates; if the LaTeX code in the call block contains any of these strings, jinja will fail to render. The workaround is to manually insert whitespace (spaces or tabs) between the two characters (e.g., {{ --> { {) wherever these strings appear in the LaTeX code; the inserted whitespace is then removed before the LaTeX source is compiled.
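
The removal step of that workaround might be written as follows. This is a sketch with a hypothetical function name, not the PR's implementation. Note that it also collapses whitespace the author did not add on purpose (for example a literal `{ {` in the source), which is normally harmless in LaTeX.

```python
import re

# "{" followed by whitespace and then "{", "%" or "#"  ->  drop the whitespace.
_OPENING = re.compile(r"\{[ \t]+(?=[{%#])")
# "}", "%" or "#" followed by whitespace and then "}"   ->  drop the whitespace.
_CLOSING = re.compile(r"([}%#])[ \t]+(?=\})")


def strip_jinja_escape_spaces(tex):
    """Undo the spaces/tabs inserted to hide jinja delimiters from jinja."""
    tex = _OPENING.sub("{", tex)
    return _CLOSING.sub(r"\1", tex)
```

The lookaheads leave the second character unconsumed, so runs such as `{ { {` collapse fully in a single pass.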

inducer · 2016-06-27T16:39:14Z

course/check.py

+    relate_course_tag = 'relate_course_tag'
+
+@register(Tags.relate_course_tag)
+@register(Tags.relate_course_tag, deploy=True)


Does this need to be duplicated?

I'm not saying it's wrong, in fact I don't know--the duplication just looks fishy.

You are right. I'll correct it.

inducer · 2016-06-30T16:42:03Z

This needs some documentation (along the lines of what you included above) to be in the .rst file.
How does this manage temporary files? I couldn't find that in the code, but it's very important to do that in a secure and race-free manner.

dzhuang · 2016-07-01T00:15:26Z

For temporary files, I think they are created here, and are removed using _remove_working_dir defined here.

dzhuang · 2016-08-29T10:12:20Z

It has been 2 months since the last commit; how time flies. Sorry, I'll return to this asap.

dzhuang · 2016-09-02T00:51:29Z

Update:

Add documentation
Enable lualatex

Conflicts: requirements.txt

dzhuang · 2016-09-22T04:51:52Z

course/latex/converter.py

+
+            try:
+                log = get_abstract_latex_log(log)
+                _file_write(self.errlog_saving_path, log)


Risk of race condition.

dzhuang · 2016-09-22T04:52:14Z

course/latex/converter.py

+                ))
+
+        try:
+            shutil.copyfile(image_path, self.image_saving_path)


Risk of race condition.
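
One common way to narrow this kind of race, shown here only as a sketch and not as what the PR ended up doing (it later moved to MongoDB storage), is to copy into a temporary file in the destination directory and then rename it into place. `atomic_copy` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def atomic_copy(src, dst):
    """Copy src to dst so that readers never observe a half-written dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temp file must live on the same filesystem as dst for the rename
    # to be atomic, hence dir=dst_dir.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copyfile(src, tmp_path)
        os.replace(tmp_path, dst)  # atomic replacement of dst
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If two requests render the same image at once, each writes its own temporary file and the last rename wins, which is acceptable when both produce identical output.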

Conflicts: course/content.py requirements.txt

dzhuang

Use MongoDB instead of the filesystem to store the converted data URIs, so as to avoid race conditions. Passes the mypy test.

dzhuang · 2016-10-29T09:18:18Z

course/latex/utils.py

+
+def _file_read(filename):
+    '''Read the content of a file and close it properly.'''
+    f = file(filename, 'rb')


not py3 compatible.
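
For reference, the `file()` builtin was removed in Python 3, while `open()` works on both Python 2 and 3. A compatible version of the helper might look like this (a sketch; the name drops the leading underscore only for illustration):

```python
import os
import tempfile


def file_read(filename):
    """Read the content of a file and close it properly."""
    # open() exists on Python 2 and 3; the with-block closes the file even
    # if reading raises.
    with open(filename, "rb") as f:
        return f.read()


# Round-trip demonstration with a throwaway file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\\documentclass{article}")
content = file_read(path)
os.remove(path)
```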

dzhuang mentioned this pull request May 16, 2016

Discuss: it is a good idea to render latex generated images in flow page? #203

Open

dzhuang added a commit to dzhuang/relate-sample that referenced this pull request May 31, 2016

update along with inducer/relate#207

2635b2c

inducer reviewed Jun 27, 2016

dzhuang force-pushed the latex2img branch 2 times, most recently from edd7547 to 48839a8 Compare September 2, 2016 01:59

add a latex macro to convert Tex code to <img> tag using dataURI.

7eb9f9c

dzhuang force-pushed the latex2img branch from 548424c to 7eb9f9c Compare September 2, 2016 02:37

Merge remote-tracking branch 'remotes/upstream/master' into latex2img

c7fb203

Conflicts: requirements.txt

dzhuang added 2 commits April 10, 2017 12:55

Merge remote-tracking branch 'remotes/upstream/master' into latex2img

ead7cb1

Conflicts: course/content.py requirements.txt

Merge branch 'master' of git://github.com/inducer/relate into latex2img

edb7ace

dzhuang commented Apr 10, 2017


dzhuang force-pushed the latex2img branch from 0075a1e to c38ac83 Compare April 10, 2017 06:16

dzhuang force-pushed the latex2img branch 3 times, most recently from ac38dbf to cb2056e Compare April 10, 2017 08:37

use mongodb to store the results.

5ccf9a5

dzhuang force-pushed the latex2img branch from cb2056e to 5ccf9a5 Compare April 10, 2017 08:44

dzhuang force-pushed the master branch 2 times, most recently from d501a29 to 3c0900e Compare February 7, 2018 15:39

dzhuang force-pushed the master branch from bc55a62 to d8817f3 Compare February 18, 2018 10:12

dzhuang force-pushed the master branch from 4efa222 to b780917 Compare March 4, 2018 19:18

dzhuang force-pushed the master branch from d08b9e0 to 8b30595 Compare April 20, 2018 20:53

Base automatically changed from master to main March 8, 2021 02:15
