An Interesting Programming Problem on Float Type Errors

4 minute read

My friend introduced a very interesting programming problem to me.
My English translation of the problem is located below. The original problem is located at https://www.acmicpc.net/problem/15328 and was made by https://www.acmicpc.net/user/ho94949.

Title: Santa’s Gift

Santa has to deliver 4 gifts. He starts from his house and tries to stop by each of the 4 houses on a 3-dimensional plane. He finds out that he has X seconds left. He travels at a velocity of 1 km/s, and no time is consumed when placing gifts or changing directions. Find if he can deliver the gifts in time.
Input:
The number of test cases T (1 ≤ T ≤ 20,000) is supplied on the first line. Each test case is composed of 5 lines. The first line is the integer time X seconds (1 ≤ X ≤ 2,000) left until Christmas, and the second to the fifth line has the integers A, B, C separated in spaces. This represents each house being separated A km horizontally, B km vertically, and C km high. If a number is negative, this means that the house is located in the opposite direction with respect to Santa’s house compared to when this value is positive, and the distance is the absolute value of the number. (-100 ≤ A,B,C ≤ 100) Output: Print “YES” if Santa can stop by each of the 4 houses and drop off the gifts, and “NO” if he cannot. Do not print the quotation marks. If Santa takes exactly X seconds, consider that he succeeded.

Test cases:

Input:

4
6
1 1 1
2 2 2
3 3 3
4 4 4
7
-1 -1 -1
-2 -2 -2
-3 -3 -3
-4 -4 -4
350
0 0 100
0 0 0
0 0 -100
0 0 0
6
0 0 0
0 0 2
0 0 3
0 0 0

Output:

NO
YES
NO
YES

Hint: In the first and second input example, Santa takes seconds to stop by all 4 houses. This is an insufficient time with 6 seconds, and a sufficient time with 7 seconds. In the third example, Santa has to travel four houses in order, so he takes 400s, not 350s. In the fourth example, Santa took 6s to visit all 4 houses in order and succeeded in placing all the gifts on time.

This question is related to https://cs.stackexchange.com/questions/84800/is-there-an-polynomial-time-algorithm-to-find-that-sum-of-square-roots-is-less-t, and as this is an open question, the only plausible way to solve this is to approximate the square root value. However, http://blog.kyouko.moe/13 stated that double and long double each has a possible error of 2-53 ≈ 1.110223 × 10-16 and 2-64 ≈ 5.4210109 × 10-20. Thus, he used __float128 for GCC, and this does not even support Visual Studio. This was of sufficient precision for this question, but there may be cases that this would not work if conditions for the problem change. Squaring repeatedly would also solve this problem, but this is probably a problem for CAS programs such as Wolfram Mathematica. After my friend saw this, he said that this question would probably not work with Python 2, but right after I saw this question, I said that this is a problem that can be solved, with BigDecimal in Java and the decimal module in Python. The decimal module is based on saving the digits as a tuple, so it is slower, but with arbitrary precision. He went ahead and used decimal to program it, but it continuously timed out with a precision of around 10-64. After lowering to 10-32, the program succeeded but consumed 46696 KB of memory, and 13556 ms of time.

Code:

https://gist.github.com/anonymous/e00cda2d1fd6ec5a8fb727d609d19980

import sys
from decimal import *
getcontext().prec = 32
def distance(fr,to):
    dis_2 = (fr[0]-to[0])**2+(fr[1]-to[1])**2+(fr[2]-to[2])**2
    return Decimal(str(dis_2))**Decimal('0.5')
def travel():
    N = Decimal(str(int(sys.stdin.readline())))
    L = []
    for i in xrange(4):
        L.append(map(int,sys.stdin.readline().split()))
    a = distance([0,0,0],L[0])
    b = distance(L[0],L[1])
    c = distance(L[1],L[2])
    d = distance(L[2],L[3])
    if a+b+c+d <= N: return 'YES'
    return 'NO'
def main():
    N = int(sys.stdin.readline())
    for i in xrange(N):
        print travel()
main()

Then, he shortcoded to 181 bytes using exec.

https://gist.github.com/anonymous/b9672c4d91614501b9cd8fb44164afd7

from decimal import*;exec"N,k,T=input(),[0]*3,0;exec'L=map(int,raw_input().split());T+=Decimal(sum(map(lambda(a,b):(a-b)**2,zip(k,L)))).sqrt();k=L;'*4;print'YNEOS'[T>N::2];"*input()

Speed optimization to 1160 ms was done by using PyPy. He also used binary search to find out that 10-21 was sufficient to solve this problem.

https://gist.github.com/anonymous/e0c9ad59229192aa3881800ebceb71fd

from decimal import getcontext, Decimal
from sys import stdin
getcontext().prec = 21
def distance(fr,to):
    return Decimal((fr[0]-to[0])**2+(fr[1]-to[1])**2+(fr[2]-to[2])**2).sqrt()
def travel():
    N = int(stdin.readline())
    L = []
    for i in xrange(4):
        L.append(map(int,stdin.readline().split()))
    a = distance([0,0,0],L[0])
    b = distance(L[0],L[1])
    c = distance(L[1],L[2])
    d = distance(L[2],L[3])
    if a+b+c+d <= N: return 'YES'
    return 'NO'
def main():
    for i in xrange(int(stdin.readline())): print travel()
main()

However, C++ codes using __float128 were highly faster and memory efficient, the fastest being 16 ms and spending 2532 KB of memory, since they did not use arbitrary precision decimal classes. The next approach should be using GNU GMP/MPFR, which enables very fast arbitrary precision calculation.

Comments