Commit

Change how we trigger OOMing.
holdenk committed Sep 5, 2023
1 parent 6a29d69 commit 7625d9e
Showing 1 changed file with 6 additions and 8 deletions.
14 changes: 6 additions & 8 deletions python/examples/bad_pyspark.py
@@ -131,22 +131,20 @@ def loggedDivZero(x):

 def runOutOfMemory(sc):
     """
-    Run out of memory on the workers.
-    In standalone modes results in a memory error, but in YARN may trigger YARN container
-    overhead errors.
+    Run out of memory on the workers from a skewed shuffle.
     >>> runOutOfMemory(sc)
     Traceback (most recent call last):
     ...
     Py4JJavaError:...
     """
     # tag::worker_oom[]
-    data = sc.parallelize(range(10))
+    data = sc.parallelize(range(10000))

-    def generate_too_much(itr):
-        return range(10000000000000)
+    def generate_too_much(i: int):
+        return list(map(lambda v: (i % 2, v), range(100000 * i)))

-    itr = data.flatMap(generate_too_much)
-    itr.count()
+    bad = data.flatMap(generate_too_much).groupByKey()
+    bad.count()
     # end::worker_oom[]
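
As a rough sanity check on why the new version produces a skewed shuffle, the record counts can be worked out without Spark. The sketch below is plain Python and is not part of bad_pyspark.py: `records_per_key` is a hypothetical helper that only mirrors the arithmetic of `generate_too_much` (input element `i` emits `100000 * i` pairs, all keyed by `i % 2`).

```python
def records_per_key(n: int = 10_000, scale: int = 100_000) -> dict[int, int]:
    """Count the pairs generate_too_much would emit for each shuffle key.

    Hypothetical helper for illustration only; it does not run Spark.
    """
    totals = {0: 0, 1: 0}
    for i in range(n):
        # Element i emits scale * i pairs, every one of them keyed by i % 2.
        totals[i % 2] += scale * i
    return totals


totals = records_per_key()
print(totals)                # {0: 2499500000000, 1: 2500000000000}
print(sum(totals.values()))  # 4999500000000
```

Roughly five trillion pairs are spread over only two keys, and `groupByKey` has to gather every value for a key in a single task, which is presumably what pushes the workers out of memory. The old version had no shuffle at all; it just counted a very large lazy range.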

