Numpy
Python Basics for AI
- numpy features
- array creation
- Handling Shape
- indexing for numpy array
- slicing for numpy array
- creation function
- operation function
- Comparisons
- boolean & fancy index
- numpy data i/o
Numpy
\[\begin{aligned} 2x_1 + 2x_2 + x_3 &= 9 \\ 2x_1 - x_2 + 2x_3 &= 6 \\ x_1 - x_2 + 2x_3 &= 5 \end{aligned}\]
\[\left[ \begin{matrix} 2 & 2 & 1 & 9 \\ 2 & -1 & 2 & 6 \\ 1 & -1 & 2 & 5 \end{matrix} \right]\]
Key idea
- How do we represent a matrix in code?
coefficient_matrix = [[2,2,1], [2,-1,2], [1, -1, 2]]
constant_vector = [9,6,5]
Problems
- How do we implement the various matrix computations?
- How do we represent very large matrices?
- Processing speed: Python is an interpreted language, so element-by-element loops over nested lists are slow
So we should use Numpy!
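As a minimal sketch of what this buys us, the system of equations above can be solved directly from the same coefficients and constants. np.linalg.solve is a standard NumPy routine, but using it here is an illustration added to these notes, and the name solution is made up for the example.

```python
import numpy as np

# The same coefficients and constants as the nested lists above.
coefficient_matrix = np.array([[2, 2, 1], [2, -1, 2], [1, -1, 2]], dtype=float)
constant_vector = np.array([9, 6, 5], dtype=float)

# Solve A x = b for x without writing any elimination loops by hand.
solution = np.linalg.solve(coefficient_matrix, constant_vector)
print(solution)  # approximately x_1 = 1, x_2 = 2, x_3 = 3

# Sanity check: substituting the solution back reproduces the constants.
print(np.allclose(coefficient_matrix @ solution, constant_vector))  # True
```

Doing the same with plain nested lists would require hand-written elimination loops.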
numpy features
- Faster and more memory-efficient than a regular list
- Supports operations on whole arrays of data without explicit loops
- Provides a wide range of linear algebra functionality
- Can be integrated with languages such as C, C++, and Fortran
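The "no explicit loops" point can be sketched as follows. The data and the names values, arr, doubled_list, and doubled_arr are made up for illustration.

```python
import numpy as np

values = list(range(1, 6))   # plain Python list: [1, 2, 3, 4, 5]
arr = np.arange(1, 6)        # the same values as an ndarray

# With a list, every element has to be visited explicitly.
doubled_list = [v * 2 for v in values]

# With an ndarray, the operation is applied to the whole array at once.
doubled_arr = arr * 2

print(doubled_list)          # [2, 4, 6, 8, 10]
print(doubled_arr.tolist())  # [2, 4, 6, 8, 10]
```

Note that `values * 2` on the list would repeat the list instead of doubling each element, which is one reason plain lists are awkward for numeric work.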
array creation
import numpy as np
test_arr = np.array(["1", 4, 5, 6], float) # every element is cast to float
print(test_arr)
# [1. 4. 5. 6.]
"""
shape : returns the dimensions of the numpy array
dtype : returns the data type of the numpy array
"""
test_arr = np.array([1, 4, 5, "8"], float)
test_arr.dtype # float64
test_arr.shape # (4,)
matrix = [[1,2,3,4], [4,5,6,7], [7,8,9,9]]
matrix = np.array(matrix, int) # a plain list has no .shape, so keep the ndarray
matrix.shape # (3, 4)
Handling Shape
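reshape
Changes the shape of an array; the number of elements stays the same
ex. (2,4)→(8,)
test_matrix = [[1,2,3,4], [1,2,5,8]] # example data (assumed; any 8 elements work)
np.array(test_matrix).reshape(2,4).shape
# (2, 4)
np.array(test_matrix).reshape(-1,2).shape
# (4, 2)
"""
-1 means the row/column count for that axis is inferred from the array's size
"""
flatten
Converts a multi-dimensional array into a 1-D array
ex. (2,2,4)→(16,)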
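A minimal sketch of the (2,2,4)→(16,) case. The array contents and the names test_tensor and flat are assumptions made for illustration.

```python
import numpy as np

# A (2, 2, 4) array holding the integers 0..15 (illustrative data).
test_tensor = np.arange(16).reshape(2, 2, 4)

flat = test_tensor.flatten()   # returns a 1-D copy
print(test_tensor.shape)       # (2, 2, 4)
print(flat.shape)              # (16,)

# reshape(-1) gives the same 1-D result, with the length inferred from the size.
print(np.array_equal(flat, test_tensor.reshape(-1)))  # True
```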
indexing for numpy array
- Unlike a list, a two-dimensional array supports the [0,0] notation
- For a matrix, the first index is the row and the second is the column
a = np.array([[1,2,3], [4.5, 5, 6]], int) # 4.5 is truncated to 4 by the int dtype
print(a[0][0]) # 1
print(a[0,0]) # 1
slicing for numpy array
- Unlike a list, rows and columns can be sliced separately
- Useful for extracting a subset of a matrix
a = np.array([[1,2,3,4,5], [6,7,8,9,10]], int)
a[:, 2:] # all rows, columns 2 and up
# [[3,4,5], [8,9,10]]
a[1, 1:3] # row 1, columns 1 to 2
# [7,8]
a[1:3] # all of rows 1 to 2 (only row 1 exists here)
# [[6,7,8,9,10]]
creation function
arange
Creates an array of values over a specified range
np.arange(5) # array([0,1,2,3,4])
np.arange(0,5,0.5) # (start, end, step); end is excluded
# array([0., 0.5, 1., ... , 4.5])
zeros - creates an ndarray filled with 0
np.zeros(shape=(10,), dtype=np.int8)
# array([0,0,0,0,0,0,0,0,0,0], dtype=int8)
np.zeros((2,5))
# array([[0., 0., 0., 0., 0.],
# [0., 0., 0., 0., 0.]])
identity
Creates an identity matrix (the I matrix)
np.identity(n=3, dtype=np.int8)
# array([[1,0,0],
# [0,1,0],
# [0,0,1]], dtype=int8)
np.identity(5) # 5x5 identity matrix, float64 by default
diag
Extracts the diagonal values of a matrix
matrix = np.arange(9).reshape(3,3)
np.diag(matrix)
# array([0,4,8])
random sampling
Creates an array by sampling from a data distribution
np.random.uniform(0,1,10).reshape(2,5) # uniform distribution
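A hedged sketch of sampling with a seeded generator, so the result is reproducible. np.random.default_rng is NumPy's newer generator interface. The seed value, the names rng, uniform_samples, and normal_samples, and the extra normal-distribution draw are additions made for illustration.

```python
import numpy as np

# A seeded generator makes the samples reproducible from run to run.
rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

uniform_samples = rng.uniform(0, 1, size=(2, 5))  # uniform on [0, 1)
normal_samples = rng.normal(0, 1, size=(2, 5))    # mean 0, standard deviation 1

print(uniform_samples.shape)  # (2, 5)
print(normal_samples.shape)   # (2, 5)
```

Passing size=(2, 5) produces the target shape directly, so no separate reshape call is needed.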
operation function
sum
Computes the sum of the elements of an ndarray, like the built-in sum for lists
test_arr = np.arange(1,11)
test_arr.sum(dtype=float) # np.float was removed in NumPy 1.24; use the built-in float
#55.0
The axis concept
The dimension axis along which an operation function is applied
⬇️ : axis=0 (down the rows)
➡️ : axis=1 (across the columns)
test_arr = np.arange(1,13).reshape(3,4)
test_arr.sum(axis=1), test_arr.sum(axis=0)
"""
1 2 3 4
5 6 7 8
9 10 11 12
(array([10, 26, 42]), array([15, 18, 21, 24]))
"""
mean & std
Return the mean or the standard deviation of the elements of an ndarray
test_arr = np.arange(1,13).reshape(3,4)
"""
1 2 3 4
5 6 7 8
9 10 11 12
"""
test_arr.mean(), test_arr.mean(axis=0)
# (6.5, array([5., 6., 7., 8.]))
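The notes show mean only, so here is a small sketch of std on the same array. The names overall_std and column_std are made up for the example.

```python
import numpy as np

test_arr = np.arange(1, 13).reshape(3, 4)

# Standard deviation over all 12 elements (population std, ddof=0 by default).
overall_std = test_arr.std()
# Standard deviation of each column (computed down the rows, axis=0).
column_std = test_arr.std(axis=0)

print(round(float(overall_std), 3))  # 3.452
print(column_std.round(3))           # every column has the same spread, about 3.266
```

Each column is spaced by 4 (for example 1, 5, 9), which is why all four column values agree.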
concatenate
Functions that combine numpy arrays
vstack & hstack
"""
1 2 3        1 2 3
        ->   2 3 4
2 3 4
"""
a = np.arange(1,4).reshape(3,)
b = np.arange(2,5).reshape(3,)
np.vstack((a,b))
# array([[1,2,3],
# [2,3,4]])
"""
1   2        1 2
2   3   ->   2 3
3   4        3 4
"""
a = np.array([[1], [2], [3]])
b = np.array([[2], [3], [4]])
np.hstack((a,b))
# array([[1,2],
# [2,3],
# [3,4]])
concatenate
Joins numpy arrays along an existing axis
# same arrays as above
a = np.arange(1,4).reshape(3,)
b = np.arange(2,5).reshape(3,)
np.concatenate((a,b), axis=0)
# array([1,2,3,2,3,4])
a = np.array([[1,2], [3,4]])
b = np.array([5,6]) # shape = (2,)
b = b[np.newaxis, :] # adds one new axis
b.shape # (1,2)
"""
Applying .T to a numpy object changes the shape to (2, 1) = the transpose operation
"""
np.concatenate((a, b.T), axis=1)
# array([[1,2,5],
# [3,4,6]])
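As a sketch of how the stacking helpers relate to concatenate for 2-D arrays: vstack corresponds to axis=0 and hstack to axis=1. The names rows and cols are made up for the example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# Stacking rows: vstack matches concatenate along axis 0.
rows = np.concatenate((a, b), axis=0)
print(np.array_equal(rows, np.vstack((a, b))))    # True
print(rows.shape)                                 # (3, 2)

# Stacking columns: b.T has shape (2, 1), so hstack matches concatenate along axis 1.
cols = np.concatenate((a, b.T), axis=1)
print(np.array_equal(cols, np.hstack((a, b.T))))  # True
print(cols.shape)                                 # (2, 3)
```

The arrays must agree on every dimension except the one being joined, which is why b is transposed before the column-wise join.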
Operations between arrays
numpy supports the basic arithmetic operations between arrays
basic
test_a = np.array([[1,2,3], [4,5,6]], float)
test_a + test_a # Matrix + Matrix operation
"""
array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]])
"""
test_a - test_a
"""
array([[0., 0., 0.],
       [0., 0., 0.]])
"""
test_a * test_a # element-wise: values at the same position are multiplied together
# happens when the shapes are the same
"""
array([[ 1.,  4.,  9.],
       [16., 25., 36.]])
"""
Dot product
The basic matrix operation (matrix multiplication); uses the dot function
test_a = np.arange(1,7).reshape(2,3)
test_b = np.arange(7,13).reshape(3,2)
test_a.dot(test_b)
"""
array([[ 58,  64],
       [139, 154]])
"""
transpose
Transposed matrix
test_a = np.arange(1,7).reshape(2,3)
"""
[[1 2 3]
[4 5 6]]
"""
test_a.T
"""
[[1 4]
[2 5]
[3 6]]
"""
broadcasting
A feature that supports operations between arrays of different shapes
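A minimal sketch of how shapes get matched, using a (4, 3) matrix and a (3,) vector chosen only for illustration. The names test_vector, result, column, and row are not from the original notes.

```python
import numpy as np

test_matrix = np.arange(1, 13).reshape(4, 3)   # shape (4, 3)
test_vector = np.arange(10, 40, 10)            # shape (3,): [10 20 30]

# The (3,) vector is stretched across all 4 rows before the addition.
result = test_matrix + test_vector
print(result)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]
#  [20 31 42]]

# A (3, 1) column and a (3,) row broadcast to a full (3, 3) grid.
column = np.arange(3).reshape(3, 1)
row = np.arange(3)
print((column + row).shape)  # (3, 3)
```

Broadcasting works when, comparing shapes from the last dimension backwards, each pair of sizes is either equal or one of them is 1.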
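test_matrix = np.arange(1,7).reshape(2,3) # np.arange, not np.array, generates the range 1..6
scalar = 3.
test_matrix + scalar
"""
array([[4., 5., 6.],
       [7., 8., 9.]])
"""
"""
Not only scalar - vector,
vector - vector operations are also supported
In that case the shapes of the two operands are matched automatically
"""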
Comparisons
all & any
a = np.arange(10)
a < 4
"""
array([ True, True, True, True, False, False, False, False, False,
False])
"""
np.any(a>5), np.any(a<0) # True if at least one element satisfies the condition
# True, False
np.all(a>5), np.all(a<10) # True only if every element satisfies the condition
# False, True
comparison operation #1
When two arrays have the same size, numpy returns the result of the element-wise comparison as a boolean array
test_a = np.array([1,3,0], float)
test_b = np.array([5,2,1], float)
test_a > test_b
# array([False, True, False])
comparison operation #2
a = np.array([1,3,0], float)
np.logical_and(a > 0, a < 3) # AND of the two conditions
# array([True, False, False])
b = np.array([True, False, True], bool)
np.logical_not(b)
# array([False, True, False])
c = np.array([False, True, False], bool)
np.logical_or(b,c) # OR of the two conditions
# array([True, True, True])
np.where
np.where(a>0, 3, 2) #where(condition, TRUE, FALSE)
# [3, 3, 2]
a = np.arange(10)
np.where(a>5)
# (array([6, 7, 8, 9]),)
argmax & argmin
Return the index of the maximum or minimum value in an array
a = np.array([1,2,4,5,8,78,23,3])
np.argmax(a), np.argmin(a)
#(5,0)
Returning along an axis
a = np.array([[1,2,4,7], [9,88,6,45], [9,76,3,4]])
np.argmax(a, axis=1), np.argmax(a, axis=0)
#(array([3, 1, 1]), array([1, 1, 1, 1]))
"""
1 2 4 7
9 88 6 45
9 76 3 4
With axis = 0 the column maxima are 9 88 6 45, so their row indices are returned
With axis = 1 the row maxima are 7 88 76, so their column indices are returned
"""
boolean & fancy index
Extracts the values that satisfy a given condition, as an array
All the comparison operation functions can be used as well
boolean indexing
test_arr = np.array([1,4,0,2,3,8,9,7], float)
test_arr[test_arr > 3]
"""
array([4., 8., 9., 7.])
"""
fancy indexing
numpy can use an array as index values to extract elements
a = np.array([2,4,6,8], float)
b = np.array([0,0,1,3,2,1], int) # must be declared as integer
a[b] # same as a.take(b)
#array([2., 2., 4., 8., 6., 4.])
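Both techniques extend naturally, sketched here under assumed data: fancy indexing on a matrix takes one index array per axis, and boolean masks can be combined. The names matrix, row_idx, col_idx, picked, and combined are made up for the example.

```python
import numpy as np

matrix = np.array([[1, 4], [9, 16]], float)   # illustrative data
row_idx = np.array([0, 0, 1, 1, 1], int)
col_idx = np.array([0, 1, 1, 1, 0], int)

# Picks matrix[0,0], matrix[0,1], matrix[1,1], matrix[1,1], matrix[1,0].
picked = matrix[row_idx, col_idx]
print(picked.tolist())  # [1.0, 4.0, 16.0, 16.0, 9.0]

# Boolean masks can be combined with & (and) and | (or); parentheses are required.
test_arr = np.array([1, 4, 0, 2, 3, 8, 9, 7], float)
combined = test_arr[(test_arr > 3) & (test_arr < 9)]
print(combined.tolist())  # [4.0, 8.0, 7.0]
```

The parentheses matter because & binds more tightly than the comparison operators in Python.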
numpy data i/o
loadtxt & savetxt
Read and save data in text format
a = np.loadtxt("./populations.txt") # load the file
a[:10]
a_int = a.astype(int)
np.savetxt("int_data.csv", a_int, delimiter=",") # save the file
#numpy object - npy
np.save("npy_test_object", arr=a_int)
a_test = np.load(file="npy_test_object.npy")
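Since populations.txt is not included with these notes, here is a self-contained round-trip sketch using stand-in data in a temporary directory. The array contents, the names data, from_text, and from_npy, and the fmt="%d" choice are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in data, since populations.txt is not available here.
data = np.arange(12).reshape(4, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "int_data.csv")
    npy_path = os.path.join(tmp_dir, "npy_test_object.npy")

    # Text round trip: fmt="%d" writes plain integers instead of the default exponent format.
    np.savetxt(csv_path, data, delimiter=",", fmt="%d")
    from_text = np.loadtxt(csv_path, delimiter=",", dtype=int)

    # Binary round trip: the .npy format stores dtype and shape with the data.
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

print(np.array_equal(data, from_text))  # True
print(np.array_equal(data, from_npy))   # True
```

The text format is readable by other tools, while .npy is the more convenient choice when the file is only read back by numpy.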
We have covered the basics of numpy. Many of the concepts were unfamiliar, so let's review them often!