ETL vs. ELT

Pros of ETL:

  • Can balance the workload / share the workload with the RDBMS
  • Can perform more complex operations in single data flow diagrams (data maps)
  • Can scale with separate hardware.
  • Can handle Partitioning & parallelism independent of the data model architecture.
  • Can process data in-stream, as it transfers from source to target
  • Does not require co-location of data sets in order to do it’s work.
  • Captures huge amounts of metadata lineage.

Pros of ELT:

  • Leverages RDBMS engine hardware for scalability
  • Keeps all data in the RDBMS all the time
  • Is parallelized according to the data set, and disk I/O is usually optimized at the engine level.
  • Scales as long as the hardware and RDBMS engine can continue to scale.
  • Can achieve 3x to 4x the throughput rates on the appropriately tuned MPP RDBMS platform.

Informatica ELT: In 2006 Informatica made an each way bet in the challenge by putting ELT style push down optimization into PowerCenter. We can see a very good list of 8.1.1 and Pushdown optimization Best practices.

ODI ELT: We can see the ELT in action at Mark Rittmans Sunopsis Data Conductor : Creating an Oracle Project and in Moving Global Electronics Data using Sunopsis he evaluates the Oracle SQL generated.

DataStage ELT: In DataStage ELT is available through running a SQL statement from a database stage or by calling a stored procedure. It can generate SQL statements but not SQL procedural code so it is not as flexible or advanced as PowerCenter or ODI.

Advertisements