Numpy

Python Basics for AI

This thumbnail was made with Thumbnail-Maker, created by Wonkook Lee.

๐Ÿ“ Numpy

Core question - how should we represent vectors and matrices in code?

\[\begin{aligned} 2x_1 + 2x_2 + x_3 &= 9 \\ 2x_1 - x_2 + 2x_3 &= 6 \\ x_1 - x_2 + 2x_3 &= 5 \end{aligned}\]
\[\left[ \begin{array}{ccc|c} 2 & 2 & 1 & 9 \\ 2 & -1 & 2 & 6 \\ 1 & -1 & 2 & 5 \end{array} \right]\]
coefficient_matrix = [[2,2,1], [2,-1,2], [1, -1, 2]]
constant_vector = [9,6,5]
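
As a minimal sketch of why this representation is useful, the two lists above can be handed to NumPy to solve the system. `np.linalg.solve` is not used elsewhere in this post; it is brought in here only for illustration.

```python
import numpy as np

# Same data as above, converted from nested lists to ndarrays
coefficient_matrix = np.array([[2, 2, 1], [2, -1, 2], [1, -1, 2]], float)
constant_vector = np.array([9, 6, 5], float)

# Solve A x = b for x
solution = np.linalg.solve(coefficient_matrix, constant_vector)
print(solution)  # approximately [1. 2. 3.]
```

Substituting x_1 = 1, x_2 = 2, x_3 = 3 back into the three equations reproduces 9, 6 and 5.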

Problems

  • How do we implement the various matrix computations?
  • How do we represent very large matrices?
  • Processing speed - Python is an interpreted language

So we should use NumPy!

🦴 numpy features

  • Faster and more memory-efficient than a plain list
  • Supports operations on whole data arrays without loops
  • Provides a wide range of linear algebra functions
  • Can be integrated with languages such as C, C++, and Fortran
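
The "without loops" point can be sketched as follows. The variable names are made up for this illustration, and the snippet only shows the difference in style; it does not measure speed.

```python
import numpy as np

values = list(range(1, 6))

# Plain list: an explicit loop (here a comprehension) is needed
doubled_list = [v * 2 for v in values]

# ndarray: the multiplication is applied to every element at once
doubled_arr = np.array(values) * 2

print(doubled_list)  # [2, 4, 6, 8, 10]
print(doubled_arr)   # [ 2  4  6  8 10]
```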

🥓 array creation

import numpy as np

test_arr = np.array(["1",4,5,6], float) # the string "1" is converted to float
print(test_arr)
# [1. 4. 5. 6.]

"""
shape : returns the dimensions of the numpy array
dtype : returns the data type of the numpy array
"""

test_arr = np.array([1,4,5,"8"], float)
test_arr.dtype # float64
test_arr.shape # (4,)

matrix = [[1,2,3,4], [4,5,6,7], [7,8,9,9]]
matrix = np.array(matrix,int) # a plain list has no shape attribute, so keep the ndarray

matrix.shape # (3, 4)

🎛️ Handling Shape

reshape

Changes the shape of an array; the number of elements stays the same.

ex. (2,4)→(8,)

test_matrix = [[1,2,3,4], [1,2,5,8]] # example data with 8 elements

np.array(test_matrix).reshape(2,4).shape
# (2, 4)

np.array(test_matrix).reshape(-1,2).shape
# (4, 2)

"""
-1 means that the row/column count is inferred from the total size
"""

flatten

Converts a multidimensional array into a one-dimensional array

ex. (2,2,4)→(16,)
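
A small sketch of the (2,2,4)→(16,) case, using made-up variable names:

```python
import numpy as np

# 16 elements arranged as shape (2, 2, 4)
cube = np.arange(16).reshape(2, 2, 4)
flat = cube.flatten()

print(cube.shape)  # (2, 2, 4)
print(flat.shape)  # (16,)
```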

👁️‍🗨️ indexing for numpy array

  • Unlike lists, two-dimensional arrays support the [0,0] notation
  • For a matrix, the first index is the row and the second is the column
a = np.array([[1,2,3], [4.5, 5, 6]], int) # 4.5 is truncated to 4 by the int dtype

print(a[0][0]) # 1
print(a[0,0]) # 1

slicing for numpy array

  • Unlike lists, rows and columns can be sliced separately
  • Useful for extracting a subset of a matrix
a = np.array([[1,2,3,4,5], [6,7,8,9,10]], int)
a[:, 2:] # all rows, columns 2 onward
#[[3,4,5], [8,9,10]]

a[1, 1:3] # row 1, columns 1 to 2
# [7,8]

a[1:3] # rows 1 to 2, all columns (row 2 does not exist, so only row 1 is returned)
# [[6,7,8,9,10]]

🔥 creation function

arange

Generates an array of values over a specified range

np.arange(5) #array([0,1,2,3,4])

np.arange(0,5,0.5) #(start, end, step), end is exclusive
# array([0., 0.5, 1., ... , 4.5])

zeros - creates an ndarray filled with zeros

np.zeros(shape=(10,), dtype=np.int8)
# array([0,0,0,0,0,0,0,0,0,0], dtype=int8)

np.zeros((2,5))
# array([[0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0.]])

identity

Creates an identity matrix (I)

np.identity(n=3, dtype=np.int8)
# array([[1,0,0],
#        [0,1,0],
#        [0,0,1]], dtype=int8)

np.identity(5)

diag

Extracts the diagonal values of a matrix

matrix = np.arange(9).reshape(3,3)
np.diag(matrix)
# array([0,4,8])
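
As a small extension, assuming you want a diagonal other than the main one, `np.diag` also takes an offset `k`. The names below are for illustration only.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

main_diag = np.diag(matrix)        # main diagonal
upper_diag = np.diag(matrix, k=1)  # one step above the main diagonal

print(main_diag)   # [0 4 8]
print(upper_diag)  # [1 5]
```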

random sampling

Creates an array by sampling from a data distribution

np.random.uniform(0,1,10).reshape(2,5) # uniform distribution
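
A short sketch of sampling from two distributions. The seed value and the use of `np.random.normal` are choices made for this illustration; they are not part of the original notes.

```python
import numpy as np

np.random.seed(0)  # fixed seed (arbitrary choice) so the run is repeatable

uniform_samples = np.random.uniform(0, 1, 10).reshape(2, 5)  # uniform on [0, 1)
normal_samples = np.random.normal(0, 1, 10).reshape(2, 5)    # mean 0, std 1

print(uniform_samples.shape, normal_samples.shape)  # (2, 5) (2, 5)
```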

🎚️ operation function

sum

Computes the sum of the elements of an ndarray, like sum for a list

test_arr = np.arange(1,11)
test_arr.sum(dtype=float) # np.float was removed in NumPy 1.24, use the built-in float

#55.0

The axis concept

The dimension axis along which an operation function is applied

⬇️ : axis=0

➡️ : axis=1

test_arr = np.arange(1,13).reshape(3,4)
test_arr.sum(axis=1), test_arr.sum(axis=0)

"""
1  2  3  4
5  6  7  8
9 10 11 12

(array([10,26,42]), array([15,18,21,24]))
"""

mean & std

Returns the mean or standard deviation of the elements of an ndarray

test_arr = np.arange(1,13).reshape(3,4)

"""
1  2  3  4
5  6  7  8
9 10 11 12
"""

test_arr.mean(), test_arr.mean(axis=0)
# (6.5, array([5., 6., 7., 8.]))
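
The heading also mentions std, so here is a minimal sketch on the same array. Note that `std` computes the population standard deviation by default (ddof=0).

```python
import numpy as np

test_arr = np.arange(1, 13).reshape(3, 4)

overall_std = test_arr.std()       # standard deviation of all 12 values
column_std = test_arr.std(axis=0)  # one value per column

print(overall_std)
print(column_std)
```

Each column holds values spaced 4 apart (for example 1, 5, 9), so all four column values come out equal.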

concatenate

Functions that combine numpy arrays

vstack & hstack

"""
1 2 3       1 2 3
        ->  2 3 4 
2 3 4
"""

a = np.arange(1,4).reshape(3,)
b = np.arange(2,5).reshape(3,)

np.vstack((a,b))
# array([[1,2,3],
#        [2,3,4]])

"""
1    2     1 2  
2    3  -> 2 3
3    4     3 4
"""

a = np.array([[1], [2], [3]])
b = np.array([[2], [3], [4]])

np.hstack((a,b))
# array([[1,2],
#        [2,3],
#        [3,4]])

concatenate

Function that joins numpy arrays

# same arrays as in the vstack example

a = np.arange(1,4).reshape(3,)
b = np.arange(2,5).reshape(3,)

np.concatenate((a,b), axis =0)
# array([1,2,3,2,3,4]), 1-D inputs are joined end to end

a = np.array([[1,2], [3,4]])
b = np.array([5,6]) #shape = (2,)
b = b[np.newaxis, :] # adds one axis
b.shape #(1,2)

"""
Applying .T to a NumPy array changes its shape to (2, 1) = transpose operation
"""

np.concatenate((a, b.T), axis=1)
# array([[1,2,5],
#        [3,4,6]])

Operations between arrays

NumPy supports the basic arithmetic operations between arrays

basic

test_a = np.array([[1,2,3], [4,5,6]], float)
test_a + test_a #Matrix + Matrix operation

"""
array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]])
"""

test_a - test_a

"""
array([[0., 0., 0.],
       [0., 0., 0.]])
"""

test_a * test_a # element-wise operation between values at the same position
                # applies when the shapes are the same

"""
array([[ 1.,  4.,  9.],
       [16., 25., 36.]])
"""

Dot product

The basic matrix operation (matrix product), computed with the dot function

test_a = np.arange(1,7).reshape(2,3)
test_b = np.arange(7,13).reshape(3,2)

test_a.dot(test_b)

"""
array([[ 58,  64],
       [139, 154]])
"""
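
For comparison, a short sketch showing that the `@` operator gives the same result as `dot` for 2-D arrays. The `@` operator is not used in the original notes.

```python
import numpy as np

test_a = np.arange(1, 7).reshape(2, 3)
test_b = np.arange(7, 13).reshape(3, 2)

product_dot = test_a.dot(test_b)
product_at = test_a @ test_b  # matrix-multiplication operator

print(product_at)
# [[ 58  64]
#  [139 154]]
```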

transpose

Transpose of a matrix

test_a = np.arange(1,7).reshape(2,3)
"""
[[1 2 3]
 [4 5 6]]
"""

test_a.T
"""
[[1 4]
 [2 5]
 [3 6]]
"""

broadcasting

A feature that supports operations between arrays of different shapes

test_matrix = np.arange(1,7).reshape(2,3)
scalar = 3.

test_matrix + scalar
"""
array([[4., 5., 6.],
       [7., 8., 9.]])
"""

"""
Supported not only for scalar - vector
but also for vector - vector
In that case the two shapes are matched automatically
"""
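
A minimal sketch of broadcasting between a matrix and a vector, with made-up example values. A shape (3,) vector is applied to every row of a shape (4, 3) matrix.

```python
import numpy as np

test_matrix = np.arange(1, 13).reshape(4, 3)  # shape (4, 3)
test_vector = np.arange(10, 40, 10)           # shape (3,): [10 20 30]

# The vector is stretched across each of the 4 rows
result = test_matrix + test_vector
print(result)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]
#  [20 31 42]]
```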

⚔️ Comparisons

all & any

a = np.arange(10)

a < 4

"""
array([ True,  True,  True,  True, False, False, False, False, False,
       False])
"""

np.any(a>5), np.any(a<0) # True if at least one element satisfies the condition
# True, False

np.all(a>5), np.all(a<10) # True only if every element satisfies the condition
# False, True

comparison operation #1

When two arrays have the same size, NumPy returns the element-wise comparison result as boolean values

test_a = np.array([1,3,0], float)
test_b = np.array([5,2,1], float)

test_a > test_b
# array([False,  True, False])

comparison operation #2

a = np.array([1,3,0], float)
np.logical_and(a > 0, a < 3) # AND of the two conditions
# array([True, False, False])

b = np.array([True, False, True], bool)
np.logical_not(b)
# array([False, True, False])

c = np.array([False, True, False], bool)
np.logical_or(b,c) # OR of the two conditions
# array([True, True, True])

np.where

np.where(a>0, 3, 2) #where(condition, value if True, value if False)
# array([3, 3, 2])

a = np.arange(10)
np.where(a>5)
# (array([6, 7, 8, 9]),)

argmax & argmin

Returns the index of the maximum or minimum value in an array

a = np.array([1,2,4,5,8,78,23,3])
np.argmax(a), np.argmin(a)
#(5,0)

Axis-based results

a = np.array([[1,2,4,7], [9,88,6,45], [9,76,3,4]])
np.argmax(a, axis=1), np.argmax(a, axis=0)

#(array([3, 1, 1]), array([1, 1, 1, 1]))

"""
1  2  4  7
9 88  6 45
9 76  3  4

For axis = 0, the column maxima are 9 88 6 45, so their row indices are returned
For axis = 1, the row maxima are 7 88 76, so their column indices are returned
"""

🎸 boolean & fancy index

Extracts the values that satisfy a given condition as an array

All of the comparison operation functions can be used here as well

boolean indexing

test_arr = np.array([1,4,0,2,3,8,9,7], float)
test_arr[test_arr > 3]

"""
array([4., 8., 9., 7.])
"""

fancy indexing

NumPy can use an array as index values to extract elements

a = np.array([2,4,6,8], float)
b = np.array([0,0,1,3,2,1], int) # must be declared as integer

a[b] # same as a.take(b)

#array([2., 2., 4., 8., 6., 4.])
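
Fancy indexing also works on matrices. The sketch below, with made-up example values, passes one index array for rows and one for columns.

```python
import numpy as np

matrix = np.array([[1, 4], [9, 16]], float)
row_idx = np.array([0, 0, 1, 1, 1], int)
col_idx = np.array([0, 1, 1, 1, 0], int)

# Each (row, column) pair picks one element
picked = matrix[row_idx, col_idx]
print(picked)  # [ 1.  4. 16. 16.  9.]
```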

👓 numpy data i/o

loadtxt & savetxt

Reads and saves data in text format

a = np.loadtxt("./populations.txt") # load the file
a[:10]

a_int = a.astype(int)
np.savetxt("int_data.csv", a_int, delimiter=",") # save the file

# numpy object - npy
np.save("npy_test_object", arr=a_int)
a_test = np.load(file="npy_test_object.npy")
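
Since `populations.txt` may not be at hand, here is a self-contained round-trip sketch. The file names and the temporary directory are assumptions made for this example only.

```python
import os
import tempfile

import numpy as np

data = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "example.csv")  # hypothetical file name
    npy_path = os.path.join(tmp_dir, "example.npy")  # hypothetical file name

    # Text format: readable by humans, but loadtxt returns floats by default
    np.savetxt(csv_path, data, delimiter=",", fmt="%d")
    from_csv = np.loadtxt(csv_path, delimiter=",")

    # Binary npy format: dtype and shape are preserved
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

print(from_csv.dtype, from_npy.dtype)
```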

We covered the basics of NumPy. Many of the concepts were unfamiliar, so let's review them often!