You are on page 1of 26

VDA401: 

Advanced Data Access


Patterns with ADO.NET 2.0

Julie Lerman
The Data Farm
jlerman@thedatafarm.com
About Me

• Independent Software Developer


• 20 years developing
• Microsoft .NET MVP
• ASPInsider
• INETA Speaker
• Various publications & conferences
• Blogs: thedatafarm.com/blog, blog.ziffdavis.com/devlife
• Founder and leader of Vermont .NET
• INETA Advisory Board Member
Agenda

• ADO.NET 2.0 DataSet performance


• Pushing huge data back to SQL Server
• Caching huge amounts of data
• Querying huge amounts of data in memory
ADO.NET 2.0 Features for this talk

• Batch Updating with DataAdapter


• SqlBulkCopy Class
• SqlDependency Class
ADO.NET 2.0 Engine Improvements

• DataSet Scalability across the board


● Can handle lots of rows
• Indexing Engine completely re-written
● Incremental indexing updates
● much more
• Low overhead for using Data Views
• Serialization: Faster, more compact
● Supports Binary & Schema-less serialization
DEMO

Indexing Engine Performance


Moving Huge Amounts of Data

• Getting data is easy and fast!


• Uploading is a different story
● 1.1 Update Options
• DataAdapter.Update – one row at a time
• Controlled stored procedure update – one row at a
time
Updating in 2.0

• DataAdapter Batch Update


● Pushes groups of rows up to SQL Server
● SQL Server still updates one at a time
● Find sweet spot for UpdateBatchSize value
• Bulk Copy
● Streamed Insert using SQL Server Bulk Copy

*slide slightly modified from original printed version


DEMO

Batch Update & Bulk Copy Class


Best of all worlds – BCP & DML

• Use BCP to stream Inserts & Updates and


Deletes to temptables
• Create & run a DML TSQL command to
process inserts, updates and deletes from
the temptables into the real tables
• Wrap it all in a transaction
• End to End streaming with DataReader

DML: Database Manipulation Language


*slide slightly modified from original printed version
DEMO

Using BCL & DML to process huge


amounts of data
DataSet Caching

• DataSet improvements enable storing lots


of data into cache
• SqlDependency enables automated
notification of changes to .NET
• Combine SqlDependency and Caching for
serious resource reduction
DEMO

SqlDependency & DataSet Cache


Part 1
Big DataSet cache begs granularity

• SqlDependency is all or nothing


• Create separate dependencies for logical
sections of your cache
DEMO

SqlDependency & DataSet Cache


Part 2: Achieve granularity
Divide & Conquer your huge dataset

• In-Memory Querying is coming!


● LiNQ
● VB9/C#3
• What about today?
● ADO.NET 2.0 has everything you need to
build a simple query processor
● Great example of QP from ADO.NET Team
Technical Lead Pablo Castro (“for demo only”)
High Level view of Query Processor
• Familiar Query Syntax
● eg SELECT, WHERE, JOIN
• Query is parsed to “Algebra Tree”
● Some of the Algebraic Operators
• Scan: enumeration over the logical tree
• Projection: identifying what columns to return
• Filter, Join, Sort
● Just a data structure
• Iterator is created for each operator
● Does the actual work of processing the data and extracting what
is defined by operator
• Projection Iterator returns all the rows but only containing the
desired fields
• Filter iterator processes all the rows and returns only rows matching
filter rules
Roll your own Query Processor
Flattened to
cached match criteria
data

Project
Iterator
Join
cached
iterator
data
r
li te or
F at
r
Ite
Enhanced
Narrowed with related
down to data
specific
columns
Our Custom QP Iterators
• Scan
● Enumerates over dataset rows (eg leaf-nodes of tree)
• Select (Project)
● DataSet Expressions can pull out columns
• Filter
● DataTable.Select for on-the-fly indexing
● Consider DataView for a permanent index and it’s raw
.NET 2.0 power
• DataTable.Select can leverage this index
• Sort
● Bundle this with the filter
• leverage DataTable.Select or DataView and indexing
Use DataView’s new power for JOIN

• JOIN will be your weakest link in a query


operation
● Iterates through joined table for every row in
primary table
• Try a DataView instead
● DataView’s indexing will speed this up
Beware Operator Constraints

• DataView/DT.Select uses == and !=


● need to be creative for implementing things
like >, IN or BETWEEN
● Can’t leverage indexing
● Roll your own?
DEMO

A look at a sample Query


Processor written with
ADO.NET 2.0
Summary

• ADO.NET 2.0’s performance and


functional improvements put new power in
our hands
• You can work with huge amounts of data
efficiently
• Functionality like BatchUpdate gives you
easy access to some of this power
• With a little more code, you can do
wonders!
Resources

• PDC05 Session DAT408, Pablo Castro,


ADO.NET Team Technical Lead
• ADO.NET Team Blog:
blogs.msdn.com/dataaccess
• MSDN Data Access Dev Center
●msdn.microsoft.com/data
• What’s New in ADO.NET 2.0, Julia Lerman,
MSDN Magazine April 2005
ADO.NET 2.0 Bookshelf is filling up

• ADO.NET and System.Xml v 2.0 – The Beta Version:


Alex Homer, Dave Sussman & Mark Fussell, Addison-Wesley
• PRO ADO.NET 2.0, Sahil Malik, APress
• Professional ADO.NET 2: Programming with SQL Server
2005, Oracle, and MySQL, Wallace B. McClure, Gregory A.
Beamer, IV John J. Croft, J. Ambrose Little, Bill Ryan, Phil Winstanley,
David Yack, Jeremy Zongker, WROX Press
• Programming Microsoft ADO.NET 2.0 Applications:
Advanced Topics, Glenn Johnson, Microsoft Press
• Microsoft ADO.NET 2.0 Step by Step, Rebecca Riordan,
Microsoft Press
Contact Info

Julie Lerman

jlerman@thedatafarm.com
www.thedatafarm.com

Blogs
www.thedatafarm.com/blog
blogs.ziffdavis.com/devlife

You might also like