1 change: 1 addition & 0 deletions doc/docker.md
@@ -190,6 +190,7 @@ For the editor container, you can also make it full width by adding `full-width-
| `MINIFY` | If true, all css & js will be minified before sending to the client. This will improve the loading performance massively, but makes it difficult to debug the javascript/css | `true` |
| `MAX_AGE` | How long may clients use served javascript code (in seconds)? Not setting this may cause problems during deployment. Set to 0 to disable caching. | `21600` (6 hours) |
| `SOFFICE` | Absolute path to the soffice (LibreOffice) executable. Needed for advanced import/export of pads (docx, pdf, odt). Setting it to null disables LibreOffice and will only allow plain text and HTML import/exports. | `null` |
| `NATIVE_DOCX_EXPORT` | Convert DOCX exports in-process with the bundled `html-to-docx` library instead of shelling out to LibreOffice. Auto-falls back to LibreOffice on error. Lets you skip installing `soffice` entirely for deployments that only need DOCX. | `false` |
| `ALLOW_UNKNOWN_FILE_ENDS` | Allow import of file types other than the supported ones: txt, doc, docx, rtf, odt, html & htm | `true` |
| `REQUIRE_AUTHENTICATION` | This setting is used if you require authentication of all users. Note: "/admin" always requires authentication. | `false` |
| `REQUIRE_AUTHORIZATION` | Require authorization by a module, or a user with is_admin set, see below. | `false` |
507 changes: 456 additions & 51 deletions pnpm-lock.yaml

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions settings.json.docker
@@ -370,6 +370,12 @@
*/
"docxExport": "${DOCX_EXPORT:true}",

/*
* Convert DOCX exports in-process via html-to-docx instead of shelling
* out to LibreOffice. Auto-falls back to the LibreOffice path on error.
*/
"nativeDocxExport": "${NATIVE_DOCX_EXPORT:false}",

/*
* txt, doc, docx, rtf, odt, html & htm
*/
10 changes: 10 additions & 0 deletions settings.json.template
@@ -351,6 +351,16 @@
*/
"docxExport": true,

/*
* Convert DOCX exports in-process with the bundled `html-to-docx` library
* rather than shelling out to LibreOffice. Skips the soffice dependency
* and removes per-export subprocess latency. If the in-process converter
* throws on a given pad, the export automatically falls back to the
* LibreOffice path — so turning this on is safe even on a deployment
* with soffice installed.
*/
"nativeDocxExport": false,

/*
* txt, doc, docx, rtf, odt, html & htm
*/
23 changes: 23 additions & 0 deletions src/node/handler/ExportHandler.ts
@@ -87,6 +87,29 @@ exports.doExport = async (req: any, res: any, padId: string, readOnlyId: string,
return;
}

// Native DOCX path (issue #7538) — when `nativeDocxExport` is enabled,
// convert the HTML export into a Word document in-process with
// `html-to-docx` instead of shelling out to LibreOffice. Saves admins
// from having to install `soffice` and avoids per-export subprocess
// latency. On failure we fall through to the LibreOffice path below
// so the change is strictly additive (opt-in via setting, auto-fallback
// if the converter throws).
if (type === 'docx' && settings.nativeDocxExport) {
  try {
    const htmlToDocx = require('html-to-docx');
    const docxBuffer = await htmlToDocx(html);
    html = null;
    res.contentType(
        'application/vnd.openxmlformats-officedocument.wordprocessingml.document');
    res.send(docxBuffer);
    return;
  } catch (err) {
    console.warn(
        `native-docx export failed for pad "${padId}", falling back to ` +
        `LibreOffice: ${(err as Error).message || err}`);
  }
Comment on lines +97 to +110
Action required

1. Docx export still needs soffice (Requirement gap, Correctness)

The new DOCX export path is opt-in (nativeDocxExport defaults to false) and explicitly falls
back to the existing LibreOffice/soffice path on error, so DOCX export is not fully free of a
LibreOffice runtime dependency. This fails the requirement to support DOCX export without requiring
LibreOffice for these formats.
Agent Prompt
## Issue description
Compliance requires DOCX export to work without a LibreOffice/`soffice` runtime dependency. The new native DOCX export is opt-in by default and explicitly falls back to the LibreOffice path on error, so LibreOffice is still required as a backstop for DOCX export.

## Issue Context
Current implementation uses `html-to-docx` when `nativeDocxExport` is enabled, but catches errors and falls through to LibreOffice. This violates the stated objective of having DOCX export not depend on LibreOffice.

## Fix Focus Areas
- src/node/handler/ExportHandler.ts[97-110]
- src/node/utils/Settings.ts[419-426]
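
One possible way to act on this is to treat LibreOffice as an optional backstop: fall back to it only when `soffice` is actually configured, and otherwise report the native converter's error. The sketch below is a minimal illustration under that assumption. `DocxExportSettings`, `DocxPlan`, and `planDocxExport` are hypothetical names that do not exist in the PR; only the `nativeDocxExport` and `soffice` setting names come from the diff.

```typescript
// Hypothetical subset of the settings object; only the two field names
// (`nativeDocxExport`, `soffice`) are taken from the PR.
type DocxExportSettings = {
  nativeDocxExport: boolean;
  soffice: string | null;
};

// Hypothetical set of strategies a DOCX export request could follow.
type DocxPlan =
  | 'native-only'          // html-to-docx; a failure becomes an error response
  | 'native-then-soffice'  // html-to-docx, LibreOffice as a backstop
  | 'soffice-only'         // historical behavior
  | 'unavailable';         // neither converter is configured

// Decide up front which converters a DOCX export may use, so the handler
// never falls through to the LibreOffice path when no binary is configured.
const planDocxExport = (s: DocxExportSettings): DocxPlan => {
  const hasSoffice = s.soffice != null;
  if (s.nativeDocxExport) return hasSoffice ? 'native-then-soffice' : 'native-only';
  return hasSoffice ? 'soffice-only' : 'unavailable';
};
```

With a plan like this, the `catch` branch in `doExport` would only continue to the LibreOffice code when the plan is `'native-then-soffice'`.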


}
Comment on lines +90 to +111
Action required

2. Docx blocked without soffice (Bug, Correctness)

Even with settings.nativeDocxExport=true, DOCX exports are rejected and hidden when settings.soffice
is null because exportAvailable() gates docx behind soffice in both the /export route and UI. This
makes the new native DOCX branch in ExportHandler unreachable in the documented no-LibreOffice
configuration.
Agent Prompt
## Issue description
Native DOCX export is implemented but is effectively unreachable in the intended “no soffice installed” configuration because the server route guard and client UI still treat `docx` as requiring LibreOffice.

## Issue Context
- Server-side guard blocks `docx` when `exportAvailable() === 'no'`.
- `exportAvailable()` currently only reflects `soffice` presence.
- Client UI removes the Word export link when `clientVars.exportAvailable === 'no'`.
- Docs say setting `SOFFICE` to `null` disables LibreOffice (typical for no-soffice deployments).

## Fix Focus Areas
- Update server export guard to allow `docx` when `settings.nativeDocxExport === true`, even if `soffice` is null:
  - src/node/hooks/express/importexport.ts[27-48]
- Add a dedicated capability flag for “Word export available” (or “nativeDocxExport enabled”) into clientVars so the UI can show Word export even when other converter-based exports remain disabled:
  - src/node/handler/PadMessageHandler.ts[1113-1118]
  - src/static/js/pad_impexp.ts[147-166]
- Avoid incorrectly enabling PDF/ODT links when only native DOCX is available (introduce a new state or separate flags rather than reusing `exportAvailable`).
  - src/node/utils/Settings.ts[700-709]
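
The separate-flags idea in the focus areas above could look roughly like the following. This is a sketch only: `ConverterSettings`, `ExportCapabilities`, and `exportCapabilities` are hypothetical names introduced for illustration, and the real `exportAvailable()` helper and `clientVars` wiring are not reproduced here.

```typescript
// Hypothetical subset of the settings object; the two field names come
// from the PR, everything else in this block is an assumption.
type ConverterSettings = {
  nativeDocxExport: boolean;
  soffice: string | null;
};

// One flag per converter-backed format, instead of a single shared state.
type ExportCapabilities = {
  docx: boolean;
  pdf: boolean;
  odt: boolean;
};

// DOCX is available through either converter; PDF and ODT still need
// LibreOffice, so enabling native DOCX alone must not expose them.
const exportCapabilities = (s: ConverterSettings): ExportCapabilities => {
  const hasSoffice = s.soffice != null;
  return {
    docx: hasSoffice || s.nativeDocxExport,
    pdf: hasSoffice,
    odt: hasSoffice,
  };
};
```

Both the server-side route guard and the client UI could consult the same per-format flags, which keeps the Word link visible in a no-`soffice` deployment without enabling the PDF or ODT links.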



// else write the html export to a file
const randNum = Math.floor(Math.random() * 0xFFFFFFFF);
const srcFile = `${tempDirectory}/etherpad_export_${randNum}.html`;
9 changes: 9 additions & 0 deletions src/node/utils/Settings.ts
@@ -205,6 +205,7 @@ export type SettingsType = {
lang: string | null,
},
enableMetrics: boolean,
nativeDocxExport: boolean,
padShortcutEnabled: {
altF9: boolean,
altC: boolean,
@@ -415,6 +416,14 @@ const settings: SettingsType = {
* Whether to enable the /stats endpoint. The functionality in the admin menu is untouched for this.
*/
enableMetrics: true,
/**
* Convert DOCX exports in-process with the `html-to-docx` library instead
* of shelling out to LibreOffice / soffice (issue #7538). Opt-in: default
* `false` preserves the historical soffice behavior. When `true`, failures
* transparently fall back to the soffice path, so flipping this on is safe
* even on a LibreOffice-enabled deployment.
*/
nativeDocxExport: false,
/**
* Whether certain shortcut keys are enabled for a user in the pad
*/
1 change: 1 addition & 0 deletions src/package.json
@@ -42,6 +42,7 @@
"express-session": "^1.19.0",
"find-root": "1.1.0",
"formidable": "^3.5.4",
"html-to-docx": "^1.8.0",
"http-errors": "^2.0.1",
"jose": "^6.2.2",
"js-cookie": "^3.0.5",
51 changes: 51 additions & 0 deletions src/tests/backend/specs/export.ts
@@ -2,6 +2,7 @@

import {MapArrayType} from "../../../node/types/MapType";

const assert = require('assert').strict;
const common = require('../common');
const padManager = require('../../../node/db/PadManager');
import settings from '../../../node/utils/Settings';
@@ -13,6 +14,7 @@ describe(__filename, function () {
before(async function () {
agent = await common.init();
settingsBackup.soffice = settings.soffice;
settingsBackup.nativeDocxExport = settings.nativeDocxExport;
await padManager.getPad('testExportPad', 'test content');
});

@@ -22,7 +24,56 @@ describe(__filename, function () {

it('returns 500 on export error', async function () {
settings.soffice = 'false'; // '/bin/false' doesn't work on Windows
settings.nativeDocxExport = false;
await agent.get('/p/testExportPad/export/doc')
.expect(500);
});

// Issue #7538: in-process DOCX export via html-to-docx bypasses the
// soffice requirement entirely. A deployment with `soffice: false` and
// `nativeDocxExport: true` should still produce a working .docx.
describe('native DOCX export (#7538)', function () {
  before(function () {
    // The upgrade-from-latest-release CI job installs deps from the
    // PREVIOUS release's package.json (before this PR adds html-to-docx)
    // and then git-checkouts this branch's code without re-running
    // `pnpm install`. Under that workflow the module isn't resolvable.
    // Skip the block in that one case; regular backend tests (which
    // install against this branch's lockfile) still exercise it.
    try {
      require.resolve('html-to-docx');
    } catch {
      this.skip();
      return;
    }
    settings.soffice = 'false';
    settings.nativeDocxExport = true;
  });

  it('returns a valid DOCX archive (PK zip signature)', async function () {
    const res = await agent.get('/p/testExportPad/export/docx')
        .buffer(true)
        .parse((resp: any, callback: any) => {
          const chunks: Buffer[] = [];
          resp.on('data', (chunk: Buffer) => chunks.push(chunk));
          resp.on('end', () => callback(null, Buffer.concat(chunks)));
        })
        .expect(200);
    const body: Buffer = res.body as Buffer;
    assert.ok(body.length > 0, 'DOCX body must not be empty');
    // Word .docx files are ZIP archives — must start with the ZIP local
    // file header signature 0x504b0304 ("PK\x03\x04").
    assert.strictEqual(body[0], 0x50, 'byte 0 (P)');
    assert.strictEqual(body[1], 0x4b, 'byte 1 (K)');
    assert.strictEqual(body[2], 0x03, 'byte 2');
    assert.strictEqual(body[3], 0x04, 'byte 3');
  });

  it('sends the Word-processing-ml content-type', async function () {
    const res = await agent.get('/p/testExportPad/export/docx').expect(200);
    assert.match(res.headers['content-type'],
        /application\/vnd\.openxmlformats-officedocument\.wordprocessingml\.document/,
        `unexpected content-type: ${res.headers['content-type']}`);
  });
});
});
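
The byte-by-byte signature check in the test above could also be expressed as a small reusable predicate. The sketch below assumes a hypothetical helper named `startsWithZipSignature` (not part of the PR); it accepts a `Uint8Array`, so a Node `Buffer` can be passed directly.

```typescript
// ZIP local file header signature: "PK\x03\x04".
const ZIP_SIGNATURE: readonly number[] = [0x50, 0x4b, 0x03, 0x04];

// Returns true when `body` is at least four bytes long and begins with the
// ZIP local file header signature. This checks only the leading bytes; it
// does not validate the rest of the archive.
const startsWithZipSignature = (body: Uint8Array): boolean =>
  body.length >= ZIP_SIGNATURE.length &&
  ZIP_SIGNATURE.every((byte, i) => body[i] === byte);
```

In the spec this would reduce the four `assert.strictEqual` calls to a single `assert.ok(startsWithZipSignature(body))`, at the cost of a less specific failure message.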