Week 1 Exercises
Week 1 Exercises
- Major issues RDBMSs need to concern itself with:
- Storing/modifying data and meta-data.
- Constraint definition/storage/maintenance/checking.
- Declarative manipulation of data (SQL).
- Extensibility via views, triggers, procedures.
- Query re-writing rules, optimization (indexes).
- Transaction processing, concurrency/recovery.
- Major stages in answering an SQL query:
- Obtain text string of SQL.
- Parsing and translation:
- Converts SQL query into relational algebra expression.
- Input: text string of SQL.
- Output: expression tree for relational algebra expression.
- Optimization:
- Converts RA expression into query plan.
- Input: relational algebra expression tree.
- Output: sequence of DBMS-specific relational operations.
- Execution:
- Performs relational operations, via chained intermediate results.
- Input: query plan (sequence of DBMS-specific relational operations).
- Output: set of result tuples, stored in memory or disk.
- Output:
- Convert result tuples into format useful for client.
- Input: tuple data in memory buffers (and possibly on disk as well).
- Output: stream of formatted tuples (format defined by library). .
- PostgreSQL is an “object-relational database management system”. Differences between PostgreSQL and conventional RDBMS:
- Every Postgres database tuple has an associated object identifier (oid).
- Postgres supports table inheritance:
create table cities ( name text, population real, altitude int ); create table capitals ( state char(2) ) inherits (cities); -- find all cities, both capital or not. select * from cities; -- find all cities, non-capitals only. select * from only cities; - Postgres supports defining new data types and operations on those types.
- Postgres supports function overloading:
create function test(int, real) returns ... create function test(smallint, double precision) returns ...
- Postgres scopes:
- Databases (or catalogs):
- Logical scope that*collects together a number of schemas.
- Details held in
pg_database.
- Schemas (or namespaces):
- Logical scope that contains a collection of database objects (tables, views, functions, etc).
- Has no effect on the way data is actually stored.
- Datails stored in
pg_namespace.
- Tablespaces:
- Physical scope identifying a region of the host filesystem where data files are stored.
- Has no effect on the logical organization of data.
- Details stored in
pg_tablespace.
- Databases (or catalogs):
- Command line arguments:
-l: Obtains a list of all databases.- Similar to meta-command
\list.
- Similar to meta-command
-f: Takes an SQL file as input and executes it.- Equivalent to meta-command
\i.
- Equivalent to meta-command
-a: Prints all nonempty input lines to standard output as they are read.- Equivalent to setting variable
ECHOtoall.
- Equivalent to setting variable
-E: Echos the actual queries generated by\dand other backslash commands.- Equivalent to setting variable
ECHO_HIDDENtoon.
- Equivalent to setting variable
- Difference between
insertandcopy:insertis standard SQL and takes a tuple of the required values. Eachinsertstatement inserts a single row and each row is inserted in separate transactions.copyis Postgres specific and takes a list of tab separated values to insert as rows. All the rows together form one single transaction.
Timing questions: a. The time variations are likely caused by caching. The results of recently executed queries are cached and hence the database does not need to access secondary storage. These cached results are much much faster to access because of this. b. There is no meaningful time value to ascribe. Both the cold and hot executions of the queries are useful for different purposes. c. If the contexts (hot or cold) is the same at the start of each timing, the results will be plausibly close. Tests should be run on the same lightly-loaded machine (to minimize differences caused by Unix buffering) and the tests should be the only user of the database server. If multiple users are competing for the buffer pool, the times could vary substantially, depending on how much of the buffered data had been swapped out to service queries from other users. d. For comparable executions of the query (either buffers empty or buffers fully-loaded), it looks like it’s no more accurate than +/- 10ms. It may be better to forget precise time measures, and simply fit queries into “ball-park” categories.
pg_catalogvsinformation_schemapg_catalog: a global schema that contains low level representation for sysadmins. Contains a set of tables and views in that schema (e.g.pg_catalog.pg_tables).View "pg_catalog.pg_tables" Column | Type | Collation | Nullable | Default -------------+---------+-----------+----------+--------- schemaname | name | | | tablename | name | | | tableowner | name | | | tablespace | name | | | hasindexes | boolean | | | hasrules | boolean | | | hastriggers | boolean | | | rowsecurity | boolean | | |information_schema: a system catalog available to users. Usually implemented as a collection of views on the native catalog schema (e.g.information_schema.tables).View "information_schema.tables" Column | Type | Collation | Nullable | Default ------------------------------+-----------------------------------+-----------+----------+--------- table_catalog | information_schema.sql_identifier | | | table_schema | information_schema.sql_identifier | | | table_name | information_schema.sql_identifier | | | table_type | information_schema.character_data | | | self_referencing_column_name | information_schema.sql_identifier | | | reference_generation | information_schema.character_data | | | user_defined_type_catalog | information_schema.sql_identifier | | | user_defined_type_schema | information_schema.sql_identifier | | | user_defined_type_name | information_schema.sql_identifier | | | is_insertable_into | information_schema.yes_or_no | | | is_typed | information_schema.yes_or_no | | | commit_action | information_schema.character_data | | |
Postgres OIDs are an (optional) system attribute hidden from the end user. These hidden attributes are not matched in standard SQL
select *and must be manually specified. Allpg_catalogtables contain an OID. Other system attributes includexminandxmaxwhich are NOT optional.- SQL view: list of table names and table OIDs from the public namespace.
select cls.oid, relname, nspname from pg_catalog.pg_class cls join pg_catalog.pg_namespace ns on cls.relnamespace = ns.oid where nspname = 'public'; create or replace function tablePath(tablename name) returns text as $$ declare table_id oid; tablespace_id oid; database_id oid; basepath text; begin -- Paths of the form: basepath/database_id/table_id select cls.oid, cls.reltablespace into table_id, tablespace_id from pg_class cls join pg_namespace nsp on cls.relnamespace = nsp.oid where cls.relname = tablename and cls.relkind = 'r' and nsp.nspname = 'public'; if tablespace_id = 0 then basepath := 'PGDATA/data'; else select spcname into basepath from pg_tablespace where oid = tablespace_id; if basepath is null then basepath := '???'; end if; end if; if table_id is null then return 'Table does not exist: ' || tablename; else select db.oid into database_id from pg_database db where db.datname = current_database(); return basepath || '/' || database_id::text || '/' || table_id::text; end if; end; $$ language plpgsql;create or replace function tableSchemas() returns setof text as $$ declare c pg_class%rowtype; a pg_attribute%rowtype; begin for c in ( select * from pg_class cls join pg_namespace nsp on cls.relnamespace = nsp.oid where cls.relkind = 'r' and nsp.nspname = 'public' order by relname ) loop return next ( select c.relname || '(' || string_agg(attname, ', ' order by attnum) || ')' from pg_attribute where attrelid = c.oid and attnum > 0 ); end loop; end; $$ language plpgsql;```sql create or replace function tableSchemas2() returns setof text as $$ declare c pg_class%rowtype; a pg_attribute%rowtype; begin – pg_attribute.atttypid: data type of this column, references pg_type.oid. – pg_attribute.atttypmod: – Records type specific data supplied at table creation time. – For example: maximum length of varchar. – Will generally by -1 for types that do not need this value.
for c in ( select * from pg_class cls join pg_namespace nsp on cls.relnamespace = nsp.oid where cls.relkind = ‘r’ and nsp.nspname = ‘public’ order by relname ) loop return next ( select c.relname || ‘(‘ || string_agg(attname || ‘:’ || typeString(atttypid, atttypmod), ‘, ‘ order by attnum) || ‘)’ from pg_attribute where attrelid = c.oid and attnum > 0 ); end loop; end; $$ language plpgsql;
create or replace function typeString(typid oid, typmod integer) returns text as \(declare typ text; begin typ := pg_catalog.format_type(typid,typmod); if (substr(typ,1,17) = 'character varying') then typ := replace(typ, 'character varying', 'varchar'); elsif (substr(typ,1,9) = 'character') then typ := replace(typ, 'character', 'char'); end if; return typ; end;\) language plpgsql;
14.
```sql
-- Update pg_attribute.atttypmod.
update pg_attribute
set atttypmod = N
from pg_class
where
pg_attribute.attrelid = pg_class.oid and
pg_class.relname = TableName and
attname = 'ColumnName';
