48 changes: 48 additions & 0 deletions Document-Processing-toc.html
@@ -147,6 +147,22 @@
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/NuGet-Packages-Required">NuGet Packages Required</a>
</li>
<li>Getting Started
<ul>
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/Extract-Data-in-ASP-NET-Core">ASP.NET Core</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/Extract-Data-in-ASP-NET-MVC">ASP.NET MVC</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/Extract-Data-in-Console">Console</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/Extract-Data-in-WPF">WPF</a>
</li>
</ul>
</li>
<li>
<a href="/document-processing/data-extraction/smart-data-extractor/net/Features">Features</a>
</li>
@@ -172,6 +188,22 @@
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/NuGet-Packages-Required">NuGet Packages Required</a>
</li>
<li>Getting Started
<ul>
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/Extract-Table-Data-in-ASP-NET-Core">ASP.NET Core</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/Extract-Table-Data-in-ASP-NET-MVC">ASP.NET MVC</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/Extract-Table-Data-in-Console">Console</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/Extract-Table-Data-in-WPF">WPF</a>
</li>
</ul>
</li>
<li>
<a href="/document-processing/data-extraction/smart-table-extractor/net/Features">Features</a>
</li>
@@ -197,6 +229,22 @@
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/nuGet-packages-required">NuGet Packages Required</a>
</li>
<li>Getting Started
<ul>
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/Recognize-Form-Data-in-ASP-NET-Core">ASP.NET Core</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/Recognize-Form-Data-in-ASP-NET-MVC">ASP.NET MVC</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/Recognize-Form-Data-in-Console">Console</a>
</li>
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/Recognize-Form-Data-in-WPF">WPF</a>
</li>
</ul>
</li>
<li>
<a href="/document-processing/data-extraction/smart-form-recognizer/net/working-with-recognize-option">Working With Recognize Options</a>
</li>
@@ -0,0 +1,33 @@
---
title: Extract Data in ASP.NET Core | Syncfusion
description: Learn how to extract data from PDF in ASP.NET Core with easy steps using the Syncfusion .NET Core Data Extraction library.
platform: document-processing
control: SmartDataExtractor
documentation: UG
---

# Extract Data in ASP.NET Core

The Syncfusion<sup>&reg;</sup> Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in ASP.NET Core applications.

To include the .NET Core Smart Data Extractor library in your ASP.NET Core application, refer to the [NuGet Packages Required](https://help.syncfusion.com/document-processing/data-extraction/smart-data-extractor/net/nuget-packages-required) or [Assemblies Required](https://help.syncfusion.com/document-processing/data-extraction/smart-data-extractor/net/assemblies-required) documentation.


## Steps to Extract Data from PDF in an ASP.NET Core Application

{% tabcontents %}
{% tabcontent Visual Studio %}
{% include_relative tabcontent-support/Extract-Data-in-ASP-NET-Core-Visual-Studio.md %}
{% endtabcontent %}

{% tabcontent Visual Studio Code %}
{% include_relative tabcontent-support/Extract-Data-in-ASP-NET-Core-VS-Code.md %}
{% endtabcontent %}

{% endtabcontents %}

You can download a complete working sample from [GitHub]().

By executing the program, you will get the JSON output as follows.
![ASP.NET Core output JSON document](GettingStarted_images/ASPCore_Output.png)

@@ -0,0 +1,84 @@
---
title: Extract Data in ASP.NET MVC Application | Syncfusion
description: Learn how to extract data in an ASP.NET MVC application with easy steps using the Syncfusion Smart Data Extractor library.
platform: document-processing
control: SmartDataExtractor
documentation: UG
keywords: Assemblies

---

# Extracting Data in ASP.NET MVC

The Syncfusion<sup>&reg;</sup> Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in ASP.NET MVC applications.

## Steps to Extract Data from a PDF Document in ASP.NET MVC

Step 1: Create a new C# ASP.NET Web Application (.NET Framework) project.
![Create ASP.NET MVC application](GettingStarted_images/MVC_Creation.png)

Step 2: In the project configuration window, name your project and select Create.
![Configuration window1](GettingStarted_images/MVC_Data1.png)
![Configuration window2](GettingStarted_images/MVC2.png)

Step 3: Install the [Syncfusion.SmartDataExtractor.AspNet.Mvc5](https://www.nuget.org/packages/Syncfusion.SmartDataExtractor.AspNet.Mvc5) NuGet package as a reference to your ASP.NET MVC application from [NuGet.org](https://www.nuget.org/).
![NuGet package installation](GettingStarted_images/MVC_Data3.png)

Step 4: Include the following namespaces in the HomeController.cs file.

{% highlight c# tabtitle="C#" %}

using Syncfusion.SmartDataExtractor;
using System.IO;
using System.Text;

{% endhighlight %}

Step 5: Add a new button in the Index.cshtml as shown below.

{% highlight c# tabtitle="C#" %}

@{
    ViewBag.Title = "Home Page";
}

<div style="margin-top:20px;">
    @using (Html.BeginForm("ExtractData", "Home", FormMethod.Get))
    {
        <input type="submit" value="Extract Data from PDF" style="width:220px;height:30px" />
    }
</div>

{% endhighlight %}

Step 6: Add a new action method named `ExtractData` in `HomeController.cs` and include the following code example to extract data from a PDF document using the **ExtractDataAsJson** method in the **DataExtractor** class.

{% highlight c# tabtitle="C#" %}

public ActionResult ExtractData()
{
    // Resolve the path to the input PDF file inside the App_Data folder.
    string inputPath = Server.MapPath("~/App_Data/Input.pdf");

    // Open the input PDF file as a read-only stream.
    using (FileStream stream = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
    {
        // Initialize the Smart Data Extractor.
        DataExtractor extractor = new DataExtractor();
        // Extract data as JSON.
        string data = extractor.ExtractDataAsJson(stream);
        // Convert the JSON string into a MemoryStream for download.
        MemoryStream outputStream = new MemoryStream(Encoding.UTF8.GetBytes(data));
        // Reset the stream position.
        outputStream.Position = 0;
        // Return the JSON file as a download in the browser.
        return File(outputStream, "application/json", "Output.json");
    }
}

{% endhighlight %}

By executing the program, you will get the JSON output as follows.
![ASP.NET MVC output JSON document](GettingStarted_images/ASPCore_Output.png)

A complete working sample can be downloaded from [GitHub](https://github.com/SyncfusionExamples/PDF-Examples/tree/master/Data-Extraction/Getting-Started/ASP.NETMVC/Extract_Data).

Click [here](https://www.syncfusion.com/document-sdk/net-pdf-data-extraction) to explore the rich set of Syncfusion<sup>&reg;</sup> Data Extraction library features.

@@ -0,0 +1,91 @@
---
title: Extract Data in Console Application | Syncfusion
description: Learn how to extract data in a Console Application by using the Syncfusion Smart Data Extractor efficiently.
platform: document-processing
control: SmartDataExtractor
documentation: UG
---

# Extract Data from PDF in Console Application

The Syncfusion<sup>&reg;</sup> Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in console applications.

## Steps to Extract Data from PDF in Console App

{% tabcontents %}
{% tabcontent Visual Studio %}
{% include_relative tabcontent-support/Extract-Data-in-Console-Visual-Studio.md %}
{% endtabcontent %}

{% tabcontent Visual Studio Code %}
{% include_relative tabcontent-support/Extract-Data-in-Console-VS-Code.md %}
{% endtabcontent %}

{% endtabcontents %}

You can download a complete working sample from [GitHub](https://github.com/SyncfusionExamples/PDF-Examples/tree/master/Data-Extraction/Getting-Started/Console/.NET/Extract_Data_as_JSON).

By executing the program, you will get the JSON output as follows.
![Console output JSON document](GettingStarted_images/ASPCore_Output.png)
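
To sanity-check the generated JSON without relying on its exact layout, the following is a minimal sketch that lists only the top-level property names. It assumes a .NET Core or .NET 5+ console project, where `System.Text.Json` is available out of the box. The `JsonInspector` class and its method are hypothetical helpers for illustration, not part of the Syncfusion API, and no particular output schema is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration; not part of the Syncfusion library.
public static class JsonInspector
{
    // Returns the names of the top-level properties when the root is a JSON object,
    // or an empty list for any other root kind (array, string, number, and so on).
    public static IReadOnlyList<string> TopLevelPropertyNames(string json)
    {
        var names = new List<string>();
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            if (document.RootElement.ValueKind == JsonValueKind.Object)
            {
                foreach (JsonProperty property in document.RootElement.EnumerateObject())
                {
                    names.Add(property.Name);
                }
            }
        }
        return names;
    }
}
```

For example, passing `File.ReadAllText("Output.json")` to `JsonInspector.TopLevelPropertyNames` and writing each returned name to the console gives a quick overview of what the extractor produced.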

## Extract Data from PDF using .NET Framework

The following steps illustrate how to extract data from a PDF document in a console application using .NET Framework.

**Prerequisites**:

* Install .NET SDK: Ensure that you have the .NET SDK installed on your system. You can download it from the [.NET Downloads page](https://dotnet.microsoft.com/en-us/download).
* Install Visual Studio: Download and install Visual Studio from the [official website](https://visualstudio.microsoft.com/downloads/).

**Steps to Extract Data from PDF using .NET Framework**

Step 1: Create a new C# Console Application (.NET Framework) project.
![Console Application creation](GettingStarted_images/ConsoleFramework.png)

Step 2: Name the project.
![Name the application](GettingStarted_images/ConsoleFrameworkName.png)

Step 3: Install the [Syncfusion.SmartDataExtractor.WinForms](https://www.nuget.org/packages/Syncfusion.SmartDataExtractor.WinForms/) NuGet package as a reference to your .NET Framework application from [NuGet.org](https://www.nuget.org).
![NET Framework NuGet package](GettingStarted_images/ConsoleNuget.png)

Step 4: Include the following namespaces in the *Program.cs*.

{% highlight c# tabtitle="C#" %}

using System.IO;
using System.Text;
using Syncfusion.Pdf.Parsing;
using Syncfusion.SmartDataExtractor;

{% endhighlight %}

Step 5: Include the following code sample in *Program.cs* to extract data from a PDF file.

{% highlight c# tabtitle="C#" %}

// Open the input PDF file as a stream.
using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read))
{
    // Initialize the Smart Data Extractor.
    DataExtractor extractor = new DataExtractor();
    // Extract data as JSON.
    string data = extractor.ExtractDataAsJson(stream);
    // Save the extracted JSON data into an output file.
    File.WriteAllText("Output.json", data, Encoding.UTF8);
}

{% endhighlight %}

Step 6: Build the project.

Click on Build > Build Solution or press Ctrl + Shift + B to build the project.

Step 7: Run the project.

Click the Start button (green arrow) or press F5 to run the app.

You can download a complete working sample from [GitHub](https://github.com/SyncfusionExamples/PDF-Examples/tree/master/Data-Extraction/Getting-Started/Console/.NETFramework/Extract_Data).

By executing the program, you will get the JSON output as follows.
![Console output JSON document](GettingStarted_images/ASPCore_Output.png)


@@ -0,0 +1,75 @@
---
title: Extract Data in WPF Application | Syncfusion
description: Learn how to extract data in a WPF application with easy steps using the Syncfusion Smart Data Extractor library.
platform: document-processing
control: SmartDataExtractor
documentation: UG
keywords: Assemblies

---

# Extract Data from PDF in WPF Application

The Syncfusion<sup>&reg;</sup> Smart Data Extractor is a .NET library used to extract structured data and document elements from PDFs and images in WPF applications.

## Steps to Extract Data from a PDF Document in WPF

Step 1: Create a new WPF application project.
![Create WPF sample](GettingStarted_images/CreationWPF.png)

In the project configuration window, name your project and select Create.
![WPF configuration window](GettingStarted_images/WPFDataName.png)

Step 2: Install the [Syncfusion.SmartDataExtractor.WPF](https://www.nuget.org/packages/Syncfusion.SmartDataExtractor.WPF) NuGet package as a reference to your WPF application from [NuGet.org](https://www.nuget.org/).
![NuGet package installation](GettingStarted_images/WPFDataNuget.png)


Step 3: Include the following namespaces in the MainWindow.xaml.cs file.

{% highlight c# tabtitle="C#" %}

using Syncfusion.SmartDataExtractor;
using System;
using System.IO;
using System.Text;
using System.Windows;

{% endhighlight %}

Step 4: Add a new button in MainWindow.xaml to extract data from a PDF document as follows.

{% highlight xaml tabtitle="XAML" %}

<Grid>
    <Button Content="Extract Data"
            Width="150" Height="40"
            HorizontalAlignment="Center"
            VerticalAlignment="Center"
            Click="ExtractButton_Click"/>
</Grid>

{% endhighlight %}

Step 5: Add the following code in the `ExtractButton_Click` event handler to extract data from a PDF document using the **ExtractDataAsJson** method in the **DataExtractor** class. The extracted content will be saved as a JSON file.

{% highlight c# tabtitle="C#" %}

// Open the input PDF file as a stream.
using (FileStream stream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read))
{
    // Initialize the Smart Data Extractor.
    DataExtractor extractor = new DataExtractor();
    // Extract data as JSON.
    string data = extractor.ExtractDataAsJson(stream);
    // Save the extracted JSON data into an output file.
    File.WriteAllText("Output.json", data, Encoding.UTF8);
}

{% endhighlight %}
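
The sample above opens `Input.pdf` through a relative path, which resolves against the process's current working directory. As a minimal sketch, assuming the input file is copied to the application's output folder, the path could instead be built from the application's base directory and checked before the stream is opened. The `InputLocator` class is a hypothetical helper for illustration and is not part of the Syncfusion API.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the Syncfusion library.
public static class InputLocator
{
    // Builds an absolute path from the application's base directory and
    // fails early with a descriptive exception if the file is missing.
    public static string ResolveInputPath(string fileName)
    {
        string fullPath = Path.Combine(AppContext.BaseDirectory, fileName);
        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException("Input file was not found.", fullPath);
        }
        return fullPath;
    }
}
```

The returned path can then be passed to the `FileStream` constructor in place of the literal `"Input.pdf"`. Failing before the extractor is created keeps a missing file distinct from an extraction problem.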

By executing the program, you will get the JSON output as follows.
![WPF output JSON document](GettingStarted_images/ASPCore_Output.png)

A complete working sample can be downloaded from [GitHub](https://github.com/SyncfusionExamples/PDF-Examples/tree/master/Data-Extraction/Getting-Started/WPF/Extract_Data).

Click [here](https://www.syncfusion.com/document-sdk/net-pdf-data-extraction) to explore the rich set of Syncfusion<sup>&reg;</sup> Data Extraction library features.