DBeaver数据库操作软件,SQL语言
SELECT …… FROM…… AS
select 表内容 form 数据库名 order by asc/desc
升序或降序排列
只能出现一个selcet
从一个/多个表中检索数据。 select 是sql中的数据查询关键词 from 和 where是后缀的定位词 。 select from【<数据库名!>】<表名>【,【<数据库名!>】<表名>……】 where<连接条件> SELECT后跟想检索的内容,一/多个字段,即可以是字段名也可以是函数(系统自定义的);也可以是星号,表示输出表中所有的字段。 FROM是检索内容的来源,即表的名称。 WHERE是指定查询条件,把满足逻辑表达式的座位查询结果,是可选项。
SELECT …… FROM…… WHERE
如:SELECT 学号,成绩 FROM 学生选课 WHERE 学号 like ‘2006*’ 在学生选课表中查询学号以2006开头的学生的学号和成绩。
select distinct :去重操作,当多行重复时,只取出一行
如:
select DISTINCT
t1.send_stockorg_code as org_id,
t2.mng_org_name as mng_reg_id
from edw_ai.dim_stockorg_relationship as t1
left join
(select
store_code,
mng_org_name
from edw_ai.dim_mng_org_store) as t2
on t1.receive_stockorg_code = t2.store_code
##
两个应用,一个是SELECT…FROM…AS… 一个是,作为连接语句的操作符。 如sql中:create table tablename as select * from tablename2; 解释:先获取tablename表中的所有记录,之后创建一张tablename表,结构和tablename2表相同,记录为后面语句的查询结果。
WITH… AS…
with 公用表达式名字 as (select * from xxx where 条件) select * from 公用表达式名字 如:with A as (select * from class) select *from A 这个语句的意思就是,先执行select * from class 得到一个结果,将这个结果记录为A ,再执行select *from A 语句。A 表只是一个别名。 也就是将重复用到的大批量 的SQL语句,放到with as 中,加一个别名,在后面用到的时候就可以直接用。(也就是说,with as 类似于子函数)
对于大批量的SQL数据,起到优化的作用。
WITH的相关总结
with as 的使用范围仅限制在,with as 紧跟着的下一条SQL。之后with as 公用SQL失效。 1.使用with子句可以让子查询重用相同的with查询块,通过select调用(with子句只能被select查询块引用),一般在with查询用到多次情况下。在引用的select语句之前定义,同级只能定义with关键字只能使用一次,多个用逗号分割。 2.with子句的返回结果存到用户的临时表空间中,只做一次查询,反复使用,提高效率。 3.在同级select前有多个查询定义的时候,第1个用with,后面的不用with,并且用逗号隔开。 4.最后一个with 子句与下面的查询之间不能有逗号,只通过右括号分割,with 子句的查询必须用括号括起来。 5.如果定义了with子句,而在查询中不使用,那么会报ora-32035 错误:未引用在with子句中定义的查询名。(至少一个with查询的name未被引用,解决方法是移除未被引用的with查询),注意:只要后面有引用的就可以,不一定非要在主查询中引用,比如后面的with查询也引用了,也是可以的。 6.前面的with子句定义的查询在后面的with子句中可以使用。但是一个with子句内部不能嵌套with子句。 7.当一个查询块名字和一个表名或其他的对象相同时,解析器从内向外搜索,优先使用子查询块名字。 8.with查询的结果列有别名,引用的时候必须使用别名或*。
多表合并的时候,经常会用到 WITH xxx AS和JOIN命令,比如说:
JOIN
SELECT t1.product_code AS prod_id,
t1.color_code AS color_id,
t1.size_code AS size_id,
t1.size_order,
FROM edw_ai.dim_sku AS t1
JOIN (SELECT product_year, product_quarter
FROM edw_ai.config_target_product
WHERE day_date = '2019-04-15') AS t2
ON t1.sku_year = t2.product_year AND
t1.sku_quarter = t2.product_quarter
注释:
’‘’用product,color,size的code来代替其id 2.’’'在form的表里的2019-04-15一天里找到poduct_year及product_quarter 3.JOIN ON 语句几位连接t1和t2表
WITH……AS和JOIN一起使用,
WITH……AS相当于建立了一个临时表
WITH
prod_info_targ AS
(SELECT t1.product_code AS prod_id,
t1.color_code AS color_id
FROM edw_ai.dim_sku AS t1
JOIN (SELECT product_year, product_quarter
FROM edw_ai.config_target_product
WHERE day_date = '{0}') AS t2
ON t1.sku_year = t2.product_year AND
t1.sku_quarter = t2.product_quarter)
SELECT t1.product_code AS prod_id,
t1.color_code AS color_id,
t1.size_code AS size_id,
t1.size_code_num AS lfs
FROM edw.skc_main_size AS t1
JOIN prod_info_targ AS t2
ON t1.product_code = t2.prod_id
AND t1.color_code = t2.color_id
with
t1 as
(SELECT stockorg_code as store_id,
org_code AS mng_reg_id
FROM edw_ai.dim_stockorg),
t2 as
(SELECT send_stockorg_code as org_id,
receive_stockorg_code,
t1.store_id,
t1.mng_reg_id
FROM edw_ai.dim_stockorg_relationship
join t1
on t1.store_id = edw_ai.dim_stockorg_relationship.receive_stockorg_code
where move_type = '发货')
select store_id, mng_reg_id, org_id, receive_stockorg_code from t2
偶尔还会用到的命令:
COALESCE(t1.qty, 0)
这个函数主要用来进行空值处理。即返回其参数中第一个非空表达式,如果所有参数均为 NULL,则 COALESCE 返回 NULL。至少应有一个 Null 值为 NULL 类型。
sql = '''
WITH
prod_info_targ AS
(SELECT t1.product_code AS prod_id,
t1.color_code AS color_id,
t1.size_code AS size_id
FROM edw_ai.dim_sku AS t1
JOIN (SELECT product_year, product_quarter
FROM edw_ai.config_target_product
WHERE day_date = '{1}') AS t2
ON t1.sku_year = t2.product_year AND
t1.sku_quarter = t2.product_quarter),
org_info_targ AS
(SELECT DISTINCT stockorg_code AS org_id
FROM edw_ai.dim_stockorg
WHERE status = '正常' AND org_flag = '1')
SELECT t1.product_code AS prod_id,
t1.color_code AS color_id,
t1.size_code AS size_id,
t1.org_code AS org_id,
t1.sale_date AS date_sell,
SUM(COALESCE(t1.qty, 0)) AS s
FROM edw_ai.fct_sales AS t1
JOIN prod_info_targ AS t2
ON t1.product_code = t2.prod_id AND
t1.color_code = t2.color_id AND
t1.size_code = t2.size_id
JOIN org_info_targ AS t3
ON t1.org_code = t3.org_id
WHERE t1.sale_date BETWEEN '{0}' AND '{1}'
GROUP BY t1.product_code, t1.color_code, t1.size_code,
t1.org_code, t1.sale_date
'''.format((date_decision - relativedelta(weeks=4))
.strftime('%Y-%m-%d'),
(date_decision - relativedelta(days=1))
.strftime('%Y-%m-%d'))
sales_info = spark.sql(sql).toPandas()
sql = '''
select
product_code as prod_id,color_code as color_id ,receive_org_code as org_id ,min(receive_date)as first_date
from edw_ai.fct_stock_move
where send_date BETWEEN '{0}' and '{1}'
group by product_code,color_code,receive_org_code
'''.format((date_decision - relativedelta(years=1))
.strftime('%Y-%m-%d'),
(date_decision - relativedelta(days=1))
.strftime('%Y-%m-%d'))
data_type = {'dt': ['first_date'],
'int': [],
'float': [],
'str': ['prod_id', 'color_id',
'org_id']}
first_rec_date = spark.sql(sql).toPandas()
first_rec_date = self.trans_data(first_rec_date, data_type)
# 转换类型函数 def trans_data(a,b)
df_t = df.copy()
for col in data_type['dt']:
if col in df_t.columns:
df_t[col] = pd.to_datetime(df_t[col])
for col in data_type['int']:
if col in df_t.columns:
df_t[col] = df_t[col].astype('float').astype('int')
for col in data_type['float']:
if col in df_t.columns:
df_t[col] = df_t[col].astype('float')
for col in data_type['str']:
if col in df_t.columns:
df_t[col] = df_t[col].astype('str')
SELECT stockorg_code AS org_id,
province AS prov_id,
city AS city_id,
reserved1 AS dist_id,
org_code AS mng_reg_id,
brand_code,
(CASE org_flag WHEN 1 THEN 1 ELSE 0 END) AS is_store
FROM edw_ai.dim_stockorg
WHERE status = '正常' and ((stockorg_code <> '9686' and org_flag= 2) OR
org_flag = 1 )
COALESCE ( expression,value1,value2……,valuen) COALESCE()函数的第一个参数expression为待检测的表达式,而其后的参数个数不定。 COALESCE()函数将会返回包括expression在内的所有参数中的第一个非空表达式。如果 expression不为空值则返回expression;否则判断value1是否是空值,如果value1不为空值则返 回value1;否则判断value2是否是空值,如果value2不为空值则返回value3;……以此类推, 如果所有的表达式都为空值,则返回NULL。
GROUP BY 统计某一个组的信息。如: SELECT Customer,SUM(OrderPrice) FROM Orders GROUP BY Customer 即统计customer的orderprice之和。
BETWEEN ‘2019-04-15’ AND ‘2019-05-01’ 选取介于两个值之间的数据范围。这些值可以是数值、文本或者日期。 有的是介于两者之间,不包含边界;有的是包含左界,不包含右届。
SUM(t1.qty) AS q