Skip to content

Conversation

@RubenCerna2079
Copy link
Contributor

@RubenCerna2079 RubenCerna2079 commented Jan 27, 2026

Why make this change?

What is this change?

  • IQueryBuilder & MsSqlQueryBuilder: Adds new function that builds the query that returns all of the tables that will be turned into entities.
  • SqlMetadataProvider & MsSqlMetadataProvider: Adds function that executes query and receives the information for all of the existing autoentities. The information that is received for each table is the schema, object, and entity name, an example of this would be dbo, book, dbo.book. The query also transforms the reserved words {schema} and {object} into the tables schema and object and returns it in the name.
  • SqlMetadataProviderUnitTests: Adds testing to ensure that the query returns the expected output.

How was this tested?

  • Integration Tests
  • Unit Tests
  • Manual Testing

@RubenCerna2079 RubenCerna2079 added this to the Jan 2026 milestone Jan 27, 2026
@RubenCerna2079 RubenCerna2079 self-assigned this Jan 27, 2026
@RubenCerna2079 RubenCerna2079 linked an issue Jan 27, 2026 that may be closed by this pull request
@RubenCerna2079 RubenCerna2079 changed the title Create and execute query for Autoentities Create and execute query to return tables and interpolated names for Autoentities Jan 27, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds MSSQL support for querying autoentities include/exclude patterns and interpolated entity names, and wires this into metadata initialization with unit test coverage.

Changes:

  • Added a new BuildGetAutoentitiesQuery API to the query builder abstraction and implemented it for MSSQL.
  • Added MSSQL metadata-provider methods to execute the autoentities query during initialization and expose results for tests.
  • Added unit tests intended to validate autoentities query output and fixed a typo in MsSqlQueryExecutor method name.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/Service.Tests/UnitTests/SqlMetadataProviderUnitTests.cs Adds unit test for autoentities query output (currently has correctness/compilation issues).
src/Core/Services/MetadataProviders/SqlMetadataProvider.cs Calls into autoentities generation during MSSQL initialization.
src/Core/Services/MetadataProviders/MsSqlMetadataProvider.cs Adds autoentities query execution hooks (currently discards results).
src/Core/Resolvers/MsSqlQueryExecutor.cs Fixes ConfigureMsSqlQueryExecutor method spelling.
src/Core/Resolvers/MsSqlQueryBuilder.cs Adds MSSQL T-SQL batch to fetch eligible tables and interpolated entity names (currently unsafe/invalid).
src/Core/Resolvers/IQueryBuilder.cs Adds new autoentities query builder API as a default interface method.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 307 to +313
public async Task InitializeAsync()
{
System.Diagnostics.Stopwatch timer = System.Diagnostics.Stopwatch.StartNew();
if (GetDatabaseType() == DatabaseType.MSSQL)
{
await GenerateAutoentitiesIntoEntities();
}
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InitializeAsync() now calls GenerateAutoentitiesIntoEntities() before checking _isValidateOnly. In validate-only mode, initialization is expected to avoid extra schema/metadata work; this change will still run autoentities queries against the DB and could fail validation runs for reasons unrelated to connection validation.

Consider skipping GenerateAutoentitiesIntoEntities() when _isValidateOnly is true (or moving it after the validate-only early return).

Copilot uses AI. Check for mistakes.
Comment on lines 572 to 595
public string BuildGetAutoentitiesQuery(string include, string exclude, string namePattern)
{
string query = $"DECLARE @include_pattern NVARCHAR(MAX) = '{include}' DECLARE @exclude_pattern NVARCHAR(MAX) = '{exclude}' DECLARE @name_pattern NVARCHAR(255) = '{namePattern}' " +
"DECLARE @exclude_invalid_types BIT = 1 SET NOCOUNT ON; WITH include_patterns AS ( SELECT LTRIM(RTRIM(value)) AS pattern " +
"FROM STRING_SPLIT(ISNULL(@include_pattern, N''), N',') WHERE LTRIM(RTRIM(value)) <> N'' ), " +
"exclude_patterns AS ( SELECT LTRIM(RTRIM(value)) AS pattern FROM STRING_SPLIT(ISNULL(@exclude_pattern, N''), N',') " +
"WHERE LTRIM(RTRIM(value)) <> N'' ), all_tables AS ( SELECT s.name AS schema_name, t.name AS object_name, s.name + N'.' " +
"+ t.name AS full_name, N'table' AS object_type, t.object_id FROM sys.tables AS t JOIN sys.schemas AS s ON t.schema_id = s.schema_id " +
"WHERE EXISTS ( SELECT 1 FROM sys.key_constraints AS kc WHERE kc.parent_object_id = t.object_id AND kc.type = 'PK' ) ), eligible_tables AS " +
"( SELECT o.schema_name, o.object_name, o.full_name, o.object_type, o.object_id, CASE WHEN so.is_ms_shipped = 1 THEN 1 WHEN o.schema_name IN " +
"(N'sys', N'INFORMATION_SCHEMA') THEN 1 WHEN o.object_name IN ( N'__EFMigrationsHistory', N'__MigrationHistory', N'__FlywayHistory', N'sysdiagrams' ) THEN 1 " +
"WHEN o.object_name LIKE N'service_broker_%' THEN 1 WHEN o.object_name LIKE N'queue_messages_%' THEN 1 WHEN o.object_name LIKE N'MSmerge_%' " +
"THEN 1 WHEN o.object_name LIKE N'MSreplication_%' THEN 1 WHEN o.object_name LIKE N'FileTableUpdates$%' THEN 1 WHEN o.object_name LIKE N'graph_%' THEN 1 WHEN EXISTS " +
"( SELECT 1 FROM sys.tables AS t WHERE t.object_id = o.object_id AND ( t.is_tracked_by_cdc = 1 OR t.temporal_type > 0 OR t.is_filetable = 1 OR t.is_memory_optimized = 1 ) ) " +
"THEN 1 ELSE 0 END AS is_system_object FROM all_tables AS o JOIN sys.objects AS so ON so.object_id = o.object_id ) SELECT a.schema_name AS [schema], a.object_name AS [object], " +
"CASE WHEN LTRIM(RTRIM(ISNULL(@name_pattern, N''))) = N'' THEN a.object_name ELSE REPLACE( REPLACE(@name_pattern, N'{schema}', a.schema_name), N'{object}', a.object_name ) " +
"END AS entity_name, CASE WHEN EXISTS ( SELECT 1 FROM sys.columns AS c JOIN sys.types AS ty ON c.user_type_id = ty.user_type_id WHERE c.object_id = a.object_id AND ty.name IN " +
"( N'geography', N'geometry', N'hierarchyid', N'sql_variant', N'xml', N'rowversion', N'vector' ) ) THEN 1 ELSE 0 END AS contains_invalid_types FROM eligible_tables AS a WHERE " +
"a.is_system_object = 0 AND ( NOT EXISTS (SELECT 1 FROM exclude_patterns) OR NOT EXISTS ( SELECT 1 FROM exclude_patterns AS ep WHERE a.full_name LIKE ep.pattern " +
"COLLATE DATABASE_DEFAULT ESCAPE '\\' ) ) AND ( NOT EXISTS (SELECT 1 FROM include_patterns) OR EXISTS ( SELECT 1 FROM include_patterns AS ip WHERE a.full_name LIKE ip.pattern " +
"COLLATE DATABASE_DEFAULT ESCAPE '\\' ) ) AND ( @exclude_invalid_types = 0 OR NOT EXISTS ( SELECT 1 FROM sys.columns AS c JOIN sys.types AS ty ON c.user_type_id = ty.user_type_id " +
"WHERE c.object_id = a.object_id AND ty.name IN ( N'geography', N'geometry', N'hierarchyid', N'sql_variant', N'xml', N'rowversion', N'vector' ) ) ) ORDER BY a.schema_name, a.object_name;";

return query;
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BuildGetAutoentitiesQuery constructs a T-SQL batch by directly interpolating include/exclude/namePattern into string literals. This is vulnerable to SQL injection and also breaks for patterns containing single quotes. Additionally, the initial DECLARE ... statements are concatenated without statement terminators (e.g., ;), which will cause a syntax error near the second DECLARE.

Use query parameters instead of string interpolation (e.g., @include_pattern, @exclude_pattern, @name_pattern) and ensure statements are properly separated/terminated.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid point, please address it.

Comment on lines 631 to 643
foreach (string included in includeObject)
{
if (resultObject[1].ToString().Contains(included))
{
includedObjectExists = true;
Assert.AreEqual(expected: "dbo", actual: resultObject[0].ToString(), "Query does not return expected schema.");
Assert.AreNotEqual(name, resultObject[3].ToString(), "Name returned by query should not include {schema} or {object}.");
if (exclude.Length > 0)
{
Assert.IsTrue(!resultObject[1].ToString().Contains(excludeObject), "Query returns pattern that should be excluded.");
}
}
}
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This foreach loop implicitly filters its target sequence - consider filtering the sequence explicitly using '.Where(...)'.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

@RubenCerna2079 RubenCerna2079 Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot I don't understand this comment

@RubenCerna2079
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

@anushakolan anushakolan self-assigned this Jan 28, 2026
Comment on lines 572 to 595
public string BuildGetAutoentitiesQuery(string include, string exclude, string namePattern)
{
string query = $"DECLARE @include_pattern NVARCHAR(MAX) = '{include}' DECLARE @exclude_pattern NVARCHAR(MAX) = '{exclude}' DECLARE @name_pattern NVARCHAR(255) = '{namePattern}' " +
"DECLARE @exclude_invalid_types BIT = 1 SET NOCOUNT ON; WITH include_patterns AS ( SELECT LTRIM(RTRIM(value)) AS pattern " +
"FROM STRING_SPLIT(ISNULL(@include_pattern, N''), N',') WHERE LTRIM(RTRIM(value)) <> N'' ), " +
"exclude_patterns AS ( SELECT LTRIM(RTRIM(value)) AS pattern FROM STRING_SPLIT(ISNULL(@exclude_pattern, N''), N',') " +
"WHERE LTRIM(RTRIM(value)) <> N'' ), all_tables AS ( SELECT s.name AS schema_name, t.name AS object_name, s.name + N'.' " +
"+ t.name AS full_name, N'table' AS object_type, t.object_id FROM sys.tables AS t JOIN sys.schemas AS s ON t.schema_id = s.schema_id " +
"WHERE EXISTS ( SELECT 1 FROM sys.key_constraints AS kc WHERE kc.parent_object_id = t.object_id AND kc.type = 'PK' ) ), eligible_tables AS " +
"( SELECT o.schema_name, o.object_name, o.full_name, o.object_type, o.object_id, CASE WHEN so.is_ms_shipped = 1 THEN 1 WHEN o.schema_name IN " +
"(N'sys', N'INFORMATION_SCHEMA') THEN 1 WHEN o.object_name IN ( N'__EFMigrationsHistory', N'__MigrationHistory', N'__FlywayHistory', N'sysdiagrams' ) THEN 1 " +
"WHEN o.object_name LIKE N'service_broker_%' THEN 1 WHEN o.object_name LIKE N'queue_messages_%' THEN 1 WHEN o.object_name LIKE N'MSmerge_%' " +
"THEN 1 WHEN o.object_name LIKE N'MSreplication_%' THEN 1 WHEN o.object_name LIKE N'FileTableUpdates$%' THEN 1 WHEN o.object_name LIKE N'graph_%' THEN 1 WHEN EXISTS " +
"( SELECT 1 FROM sys.tables AS t WHERE t.object_id = o.object_id AND ( t.is_tracked_by_cdc = 1 OR t.temporal_type > 0 OR t.is_filetable = 1 OR t.is_memory_optimized = 1 ) ) " +
"THEN 1 ELSE 0 END AS is_system_object FROM all_tables AS o JOIN sys.objects AS so ON so.object_id = o.object_id ) SELECT a.schema_name AS [schema], a.object_name AS [object], " +
"CASE WHEN LTRIM(RTRIM(ISNULL(@name_pattern, N''))) = N'' THEN a.object_name ELSE REPLACE( REPLACE(@name_pattern, N'{schema}', a.schema_name), N'{object}', a.object_name ) " +
"END AS entity_name, CASE WHEN EXISTS ( SELECT 1 FROM sys.columns AS c JOIN sys.types AS ty ON c.user_type_id = ty.user_type_id WHERE c.object_id = a.object_id AND ty.name IN " +
"( N'geography', N'geometry', N'hierarchyid', N'sql_variant', N'xml', N'rowversion', N'vector' ) ) THEN 1 ELSE 0 END AS contains_invalid_types FROM eligible_tables AS a WHERE " +
"a.is_system_object = 0 AND ( NOT EXISTS (SELECT 1 FROM exclude_patterns) OR NOT EXISTS ( SELECT 1 FROM exclude_patterns AS ep WHERE a.full_name LIKE ep.pattern " +
"COLLATE DATABASE_DEFAULT ESCAPE '\\' ) ) AND ( NOT EXISTS (SELECT 1 FROM include_patterns) OR EXISTS ( SELECT 1 FROM include_patterns AS ip WHERE a.full_name LIKE ip.pattern " +
"COLLATE DATABASE_DEFAULT ESCAPE '\\' ) ) AND ( @exclude_invalid_types = 0 OR NOT EXISTS ( SELECT 1 FROM sys.columns AS c JOIN sys.types AS ty ON c.user_type_id = ty.user_type_id " +
"WHERE c.object_id = a.object_id AND ty.name IN ( N'geography', N'geometry', N'hierarchyid', N'sql_variant', N'xml', N'rowversion', N'vector' ) ) ) ORDER BY a.schema_name, a.object_name;";

return query;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid point, please address it.

/// <returns></returns>
public string BuildGetAutoentitiesQuery(string include, string exclude, string namePattern)
{
string query = $"DECLARE @include_pattern NVARCHAR(MAX) = '{include}' DECLARE @exclude_pattern NVARCHAR(MAX) = '{exclude}' DECLARE @name_pattern NVARCHAR(255) = '{namePattern}' " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire query is constructed as one giant C# interpolated string. This makes it extremely hard to read, review, or safely maintain. Would strongly recommend rewriting using a verbatim @$" … " string with proper indentation or move the SQL into a helper method/constant. This reduces noise from escaping and makes the structure readable.

public string BuildGetAutoentitiesQuery(string include, string exclude, string namePattern)
{
string query = $"DECLARE @include_pattern NVARCHAR(MAX) = '{include}' DECLARE @exclude_pattern NVARCHAR(MAX) = '{exclude}' DECLARE @name_pattern NVARCHAR(255) = '{namePattern}' " +
"DECLARE @exclude_invalid_types BIT = 1 SET NOCOUNT ON; WITH include_patterns AS ( SELECT LTRIM(RTRIM(value)) AS pattern " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The include and exclude lists both use the same logic (split → trim → remove empty values).
Right now this logic is duplicated in two different CTEs.
It would be cleaner and safer to put this logic in one shared place so both lists are processed the same way.
This avoids differences in behavior later if someone updates one list but forgets to update the other.

}
}

Assert.IsTrue(includedObjectExists, "Query does not return expected object.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test checks only includedObjectExists, which confirms that one expected object is present, but it does not verify that the full expected set is returned.
For multi‑include cases (e.g., {book, publish}) the query might return only book or might include extra objects, and the current test would still pass.
Consider adding a final SetEquals / CollectionAssert.AreEquivalent check to validate the complete result set and detect missing or extra objects.

@github-project-automation github-project-automation bot moved this from Todo to Review In Progress in Data API builder Jan 28, 2026
@RubenCerna2079
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 6 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Review In Progress

Development

Successfully merging this pull request may close these issues.

Query include & exclude properties for autoentities Add interpolation for schema/object

3 participants