Bertelsmann Data Scholarship - Python Part 1

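After the statistics theory comes the technical part of the course, which covers Python and SQL (using PostgreSQL).

Today I will start by reviewing the Python material covered so far: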

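1. Data Types and Operators

The course starts with operators. Addition, subtraction, multiplication, and division work as in ordinary arithmetic; the more unusual ones are:

** exponentiation: 3 to the 6th power is 3**6

// floor division (rounds down): 7//2 = 3, -7//2 = -4

% remainder (modulo)

= is assignment; == tests whether two values are equal

+= -= *= are augmented assignments (update a variable in place)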
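As a quick check of the operators listed above, here is a minimal sketch; the variable names are illustrative only and do not come from the course.

```python
# Exponentiation: 3 to the 6th power
power = 3 ** 6            # 729

# Floor division rounds toward negative infinity
floor_pos = 7 // 2        # 3
floor_neg = -7 // 2       # -4

# Modulo gives the remainder of a division
remainder = 7 % 2         # 1

# = assigns a value, == compares two values
total = 10
is_ten = (total == 10)    # True

# Augmented assignment updates the variable in place
total += 5                # 15
total -= 3                # 12
total *= 2                # 24

print(power, floor_pos, floor_neg, remainder, is_ten, total)
```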

 

Integers and floats

type() returns the type of its argument

int(49.7) = 49; the fractional part is simply dropped
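A small sketch of type() and int() conversion. The negative example is an addition of mine, included to show that int() truncates toward zero rather than rounding down.

```python
# type() reports the type of a value
int_type = type(49)          # <class 'int'>
float_type = type(49.7)      # <class 'float'>

# int() drops the fractional part (truncates toward zero)
truncated_pos = int(49.7)    # 49
truncated_neg = int(-49.7)   # -49, not -50

# float() converts an integer to a float
as_float = float(49)         # 49.0

print(int_type, float_type, truncated_pos, truncated_neg, as_float)
```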

 

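Errors come in two kinds: exceptions and syntax errors

An exception is an error raised while the code is running

A syntax error is detected when Python checks the code, before it runs.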
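The difference can be seen in a short sketch. compile() is used here only so that the syntax error can be caught and inspected inside a running script; normally Python would report it before executing anything.

```python
# An exception is raised while the code is running
try:
    result = 1 / 0
except ZeroDivisionError as exc:
    runtime_message = str(exc)

# A syntax error is found when Python parses the code, before running it.
# The source string below is deliberately missing a closing parenthesis.
try:
    compile("print('missing parenthesis'", "<example>", "exec")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

print(runtime_message)
print(syntax_ok)
```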

 

Booleans True and False; logical operators and, or, not


Next come the more important and more difficult topics

Strings, lists, tuples, sets, and dictionaries

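Data Structure   Ordered   Mutable    Constructor           Example
int              NA        NA         int()                 5
float            NA        NA         float()               6.5
string           Yes       No         ' ' or " " or str()   "this is a string"
bool             NA        NA         NA                    True or False
list             Yes       Yes        [ ] or list()         [5, 'yes', 5.7]
tuple            Yes       No         ( ) or tuple()        (5, 'yes', 5.7)
set              No        Yes        { } or set()          {5, 'yes', 5.7}
dictionary       No        Keys: No   { } or dict()         {'Jun': 75, 'Jul': 89}

Notes: an empty { } creates a dictionary, so an empty set must be written set(). A dictionary itself is mutable, but its keys must be immutable. Since Python 3.7, dictionaries also preserve insertion order.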

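Strings: ordered and immutable

Use single quotes to wrap a string that contains double quotes

\ escapes a character

+ concatenates strings and * repeats them; there is no subtraction or division

len() / format()

a.count(b) / a.find(b)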
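A minimal sketch of the string operations above; the sample strings are made up for illustration.

```python
# Single quotes can wrap double quotes; a backslash escapes them instead
quote = 'She said "hello"'
escaped = "She said \"hello\""
same_text = (quote == escaped)          # True

# + concatenates, * repeats
greeting = "Hello" + " " + "world"      # 'Hello world'
echo = "ab" * 3                         # 'ababab'

# len() counts characters, format() fills in placeholders
length = len(greeting)                  # 11
message = "{} has {} characters".format(greeting, length)

# count() counts occurrences; find() returns the first index, or -1 if absent
count_l = greeting.count("l")           # 3
pos_world = greeting.find("world")      # 6
pos_missing = greeting.find("xyz")      # -1

print(message)
```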


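list: ordered and mutable, widely used

names = ["Carol", "Albert", "Ben", "Donna"]

names[0] is the first element; names[-1] is the last, the same as names[len(names)-1]

month[6:9] gives months 7 through 9; the upper bound is excluded.

[:6] gives the first 6 elements, excluding the 7th

in and not in test membership

max() returns the largest number, or for strings the one that sorts last alphabetically; values of mixed types cannot be compared. min() works the opposite way

sorted() sorts in ascending order

sorted(list, reverse=True) sorts in descending order

"-".join(list) joins a list of strings using the separator

b.append('one') adds an element to the end of the list, modifying it in place

b = names.pop(1) removes the second element and returns it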
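The list operations above, sketched on two small lists. The months list and the name "Eve" are hypothetical additions for illustration.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

first = months[0]               # 'Jan'
last = months[-1]               # 'Dec', same as months[len(months) - 1]
q3 = months[6:9]                # ['Jul', 'Aug', 'Sep'], upper bound excluded
first_half = months[:6]         # 'Jan' through 'Jun'
has_may = "May" in months       # True

names = ["Carol", "Albert", "Ben", "Donna"]
ascending = sorted(names)                 # returns a new sorted list
descending = sorted(names, reverse=True)
joined = "-".join(names)                  # 'Carol-Albert-Ben-Donna'

names.append("Eve")             # modifies names in place
removed = names.pop(1)          # removes and returns 'Albert'
latest = max(names)             # 'Eve' sorts last alphabetically

print(q3, ascending, joined, removed, names)
```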


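Tuples: an immutable, ordered data structure; the parentheses are optional

There are no append or pop methods

tup1[0] looks up an element by index with square brackets

Built-in functions that work on tuples include max(), min(), and tuple(listname) for conversion. cmp(), which compared two tuples, exists only in Python 2; in Python 3 use comparison operators such as == and <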

location = (13.4125, 103.866667)
print("Latitude:", location[0])
print("Longitude:", location[1])

Tuple unpacking assigns the elements of a tuple to separate variables; the second line below is the unpacking

dimensions = 52, 40, 100
length, width, height = dimensions
print("The dimensions are {} x {} x {}".format(length, width, height))

It can also be written as

length, width, height = 52, 40, 100
print("The dimensions are {} x {} x {}".format(length, width, height))
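To round off the tuple notes, a short sketch of immutability and comparison. In Python 3 the comparison operators take the place of cmp(); the sample values other than location are illustrative.

```python
location = (13.4125, 103.866667)

# Tuples are immutable: assigning to an element raises TypeError
try:
    location[0] = 0.0
    mutated = True
except TypeError:
    mutated = False

# Tuples compare element by element, left to right
is_smaller = (1, 2, 3) < (1, 2, 4)      # True
is_equal = (1, 2) == tuple([1, 2])      # True

# max() and min() work on tuples as they do on lists
highest = max(location)                 # 103.866667
lowest = min(location)                  # 13.4125

print(mutated, is_smaller, is_equal, highest, lowest)
```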

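set: holds unique elements in no particular order, and is mutable

scores = ['4', '5', '6', '9']

A set can be created from a list with set(listname)

set.add('newone') adds an element; there is no append(), and because a set is unordered the new element has no defined position

set.pop() removes and returns an arbitrary element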
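A minimal sketch of the set behaviour described above. The duplicate entries in scores are added here on purpose to show that a set drops them.

```python
# Duplicates are included deliberately for this illustration
scores = ['4', '5', '6', '9', '4', '5']

unique_scores = set(scores)       # duplicates are removed
unique_scores.add('7')            # adds a new element
unique_scores.add('4')            # already present, so nothing changes
size_before_pop = len(unique_scores)   # 5

# pop() removes and returns an arbitrary element
removed = unique_scores.pop()
size_after_pop = len(unique_scores)    # 4

print(size_before_pop, removed, size_after_pop)
```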


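dictionary: stores key:value pairs, similar in concept to an associative array

Keys can be an int, a string, a float, or any other immutable type; they do not all have to be the same type

elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

print(elements["helium"])

To add an entry, just assign to a new key: elements["lithium"] = 3

"some" in elements checks whether a key exists in the dictionary

elements.get("some") is safer than the square-bracket lookup elements["some"]: if the key does not exist it returns None instead of raising an error

A custom fallback value can also be supplied: elements.get("some", "There's no such key")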
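The lookups above can be compared side by side in a short sketch, reusing the elements dictionary from these notes.

```python
elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

# Assigning to a new key adds an entry
elements["lithium"] = 3

# in checks the keys
has_carbon = "carbon" in elements        # True
has_some = "some" in elements            # False

# get() returns None, or a chosen default, when the key is missing
missing = elements.get("some")
fallback = elements.get("some", "There's no such key")

# Square brackets raise KeyError for a missing key
try:
    elements["some"]
    raised = False
except KeyError:
    raised = True

print(has_carbon, has_some, missing, fallback, raised)
```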


Compound Data Structures: containers nested inside other containers, which play roughly the role of multidimensional arrays in Python

elements = {"hydrogen": {"number": 1,
                         "weight": 1.00794,
                         "symbol": "H"},
              "helium": {"number": 2,
                         "weight": 4.002602,
                         "symbol": "He"}}

hydrogen_weight = elements["hydrogen"]["weight"]
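A minimal sketch of reading a nested dictionary, using the same data as above. The chained get() call for a missing element is one possible way to avoid a KeyError, not something taken from the course.

```python
elements = {"hydrogen": {"number": 1,
                         "weight": 1.00794,
                         "symbol": "H"},
            "helium": {"number": 2,
                       "weight": 4.002602,
                       "symbol": "He"}}

# Chain the keys to reach an inner value
hydrogen_weight = elements["hydrogen"]["weight"]

# Loop over the outer dictionary and read from each inner one
symbols = []
for name, info in elements.items():
    symbols.append(info["symbol"])

# For a key that may be missing, fall back to an empty dict first
carbon_weight = elements.get("carbon", {}).get("weight")   # None

print(hydrogen_weight, symbols, carbon_weight)
```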

 



Control flow

if / elif / else

Pay attention to the colons and the indentation.

points = 174  # use this input when submitting your answer

# set prize to default value of None
prize = None

# use the value of points to assign prize to the correct prize name
if points <= 50:
    prize = "wooden rabbit"
elif 151 <= points <= 180:
    prize = "wafer-thin mint"
elif points >= 181:
    prize = "penguin"

# use the truth value of prize to assign result to the correct message
if prize:
    result = "Congratulations! You won a {}!".format(prize)
else:
    result = "Oh dear, no prize this time."

print(result)

 


Loops

Using for and while

usernames = ["Joey Tribbiani", "Monica Geller", "Chandler Bing", "Phoebe Buffay"]

for i in range(len(usernames)):
    usernames[i] = usernames[i].lower().replace(" ", "_")

print(usernames)

Computing a factorial with while

# Start with a sample number for first test - change this when testing your code more!
number = 6   
# We'll always start with our product equal to the number
product = number

# Write while loop header line - how will you tell it when to stop looping?
while  number > 1:
    # Each time through the loop, what do we want to do to our number?
    number -= 1
    # Each time, what do we want to multiply the current product by?
    product *= number
# Print out final product (how do we indicate this should happen after loop ends?)
print(product)

The same with for

# This is the number we'll find the factorial of - change it to test your code!
number = 6
# We'll start with the product equal to the number
product = number

# Write a for loop that calculates the factorial of our number 
for num in range(1, number):
    product *= num

# print the factorial of your number
print(product)

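The range function; it is often used to get the indices of a list

range(start, stop, step): start defaults to 0 and step defaults to 1

range(5): with a single integer, that integer is the stop value, giving 0 to 4

range(2, 10): start = 2, stop = 10, step defaults to 1, giving 2 to 9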
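Wrapping range() in list() makes the generated values visible. The step examples below are additions of mine for illustration.

```python
# A single argument is the stop value; start defaults to 0
up_to_five = list(range(5))            # [0, 1, 2, 3, 4]

# start and stop; stop itself is excluded
two_to_nine = list(range(2, 10))       # [2, 3, 4, 5, 6, 7, 8, 9]

# A third argument sets the step
odds = list(range(1, 10, 2))           # [1, 3, 5, 7, 9]
countdown = list(range(10, 0, -3))     # [10, 7, 4, 1]

print(up_to_five, two_to_nine, odds, countdown)
```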

Iterating over a dictionary

fruit_count, not_fruit_count = 0, 0
basket_items = {'apples': 4, 'oranges': 19, 'kites': 3, 'sandwiches': 8}
fruits = ['apples', 'oranges', 'pears', 'peaches', 'grapes', 'bananas']

# Iterate through the dictionary
for fruit, count in basket_items.items():
    if fruit in fruits:
        fruit_count += count
    else:
        not_fruit_count += count

print("The number of fruits is {}.  There are {} items that are not fruits.".format(fruit_count, not_fruit_count))

 

while keeps running as long as its condition is true

card_deck = [4, 11, 8, 5, 13, 2, 8, 10]
hand = []

# adds the last element of the card_deck list to the hand list
# until the values in hand add up to 17 or more
while sum(hand) < 17:
    hand.append(card_deck.pop())

limit = 40

num = 0
while (num+1)**2 < limit:
    num += 1
nearest_square = num**2

print(nearest_square)

break and continue

break terminates the loop and exits it

continue skips the rest of the current iteration without exiting the loop

headlines = ["Local Bear Eaten by Man",
             "Legislature Announces New Laws",
             "Peasant Discovers Violence Inherent in System",
             "Cat Rescues Fireman Stuck in Tree",
             "Brave Knight Runs Away",
             "Papperbok Review: Totally Triffic"]

news_ticker = ""
for headline in headlines:
    news_ticker += headline + " "
    if len(news_ticker) >= 140:
        news_ticker = news_ticker[:140]
        break

print(news_ticker)
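The headline example uses break only. For contrast, here is a minimal sketch of continue and break on the same input; the numbers list is made up for illustration.

```python
numbers = [3, -1, 8, 0, -7, 5]

# continue: skip non-positive values but keep looping
positive_total = 0
for n in numbers:
    if n <= 0:
        continue
    positive_total += n

# break: stop at the first non-positive value
leading_total = 0
for n in numbers:
    if n <= 0:
        break
    leading_total += n

print(positive_total)   # 3 + 8 + 5 = 16
print(leading_total)    # only 3 is added before the loop stops
```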

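zip and enumerate

zip can be thought of simply as pairing up the elements of several lists into tuples, which dict() can then turn into a dictionary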

x_coord = [23, 53, 2, -12, 95, 103, 14, -5]
y_coord = [677, 233, 405, 433, 905, 376, 432, 445]
z_coord = [4, 16, -6, -42, 3, -6, 23, -1]
labels = ["F", "J", "A", "Q", "Y", "B", "W", "X"]

points = []
for point in zip(labels, x_coord, y_coord, z_coord):
    points.append("{}: {}, {}, {}".format(*point))

for point in points:
    print(point)

 

cast_names = ["Barney", "Robin", "Ted", "Lily", "Marshall"]
cast_heights = [72, 68, 72, 66, 76]

cast = dict(zip(cast_names, cast_heights))
print(cast)
Output (on Python 3.7 and later the keys appear in insertion order instead):
{'Lily': 66, 'Barney': 72, 'Marshall': 76, 'Ted': 72, 'Robin': 68}
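To see what zip produces before dict() consumes it, a small sketch with shortened versions of the lists above. The length-mismatch example is an addition of mine.

```python
cast_names = ["Barney", "Robin", "Ted"]
cast_heights = [72, 68, 72]

# zip yields one tuple per position; list() makes them visible
pairs = list(zip(cast_names, cast_heights))
# [('Barney', 72), ('Robin', 68), ('Ted', 72)]

# dict() turns (key, value) tuples into a dictionary
cast = dict(pairs)

# zip stops as soon as the shortest input runs out
uneven = list(zip([1, 2, 3], ["a", "b"]))   # [(1, 'a'), (2, 'b')]

print(pairs, cast, uneven)
```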

 

Enumerate

cast = ["Barney Stinson", "Robin Scherbatsky", "Ted Mosby", "Lily Aldrin", "Marshall Eriksen"]
heights = [72, 68, 72, 66, 76]

for i, character in enumerate(cast):
    cast[i] = character + " " + str(heights[i])

print(cast)

 

['Barney Stinson 72', 'Robin Scherbatsky 68', 'Ted Mosby 72', 'Lily Aldrin 66', 'Marshall Eriksen 76']

List Comprehensions

A list comprehension builds a new list from an iterable in a single expression, optionally filtering the elements

multiples_3 = [x * 3 for x in range(1, 21)]
print(multiples_3)
[3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60]
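The example above only transforms values. A condition can also be added at the end to filter them, as in this minimal sketch; the equivalent loop is shown for comparison and all names are illustrative.

```python
# Transform only: square every number from 0 to 8
squares = [x ** 2 for x in range(9)]

# Transform and filter: keep only the even numbers before squaring
even_squares = [x ** 2 for x in range(9) if x % 2 == 0]
# [0, 4, 16, 36, 64]

# The same result written as an ordinary loop
even_squares_loop = []
for x in range(9):
    if x % 2 == 0:
        even_squares_loop.append(x ** 2)

print(squares, even_squares, even_squares_loop)
```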