除了社交、娛樂(lè),人們外出使用最多的軟件可能就是各種地圖導(dǎo)航軟件了,最常用的比如高德和百度地圖。 打開(kāi)軟件,雙指滑動(dòng)手機(jī)屏幕,放大縮小地圖的時(shí)候,你可能會(huì)注意到這樣的情況:放大地圖,道路變得崎嶇,彎來(lái)繞去;縮小地圖,道路一下子變得筆直。就像下面這種情況,造成這種情況的原因就是 RDP 算法或者是相關(guān)的改進(jìn)算法。 同樣在 GIS 軟件中的抽稀和概化功能就是相應(yīng)算法的實(shí)現(xiàn)。 左:道路較為筆直;右:道路較為曲折 什么是 RDP 算法RDP 算法的全稱(chēng)是拉默-道格拉斯-普克算法(英文:Ramer–Douglas–Peucker algorithm),分別是三位科學(xué)家的名字,又稱(chēng)道格拉斯-普克算法(DP),也稱(chēng)迭代端點(diǎn)擬合算法(英語(yǔ):iterative end-point fit algorithm)。 這是一種將曲線(xiàn)(折線(xiàn))降采樣為點(diǎn)數(shù)較少的類(lèi)似曲線(xiàn)(折線(xiàn))的算法,簡(jiǎn)單來(lái)說(shuō)就是簡(jiǎn)化;是線(xiàn)狀要素抽稀的經(jīng)典算法(也可以叫概化),在保證幾何形狀基本不變的情況下,去除大量冗余折點(diǎn),縮小體積。 這對(duì)于網(wǎng)絡(luò)地圖是非常重要的,能減少加載時(shí)間,增強(qiáng)程序穩(wěn)定性,保證終端設(shè)備的流暢,進(jìn)而提升用戶(hù)的體驗(yàn)。 RDP 算法的應(yīng)用非常廣泛,特別是 GIS 領(lǐng)域。 與 RDP 相似的另一個(gè)著名的算法是Visvalingam-Whyatt。大家可登陸 mapshaper.org 網(wǎng)站在線(xiàn)使用這兩種算法,可調(diào)節(jié)參數(shù),可導(dǎo)出處理后的 shp 數(shù)據(jù)。 算法基本原理將待處理曲線(xiàn)(折線(xiàn))的首末端點(diǎn)連成一條直線(xiàn),求所有中間點(diǎn)與直線(xiàn)的距離,并找出最大距離 max,用 max 與抽稀閾值 ε 相比較: 若 max <= ε,這條曲線(xiàn)上的中間點(diǎn)全部舍去; 若 max > ε,保留 max 對(duì)應(yīng)的坐標(biāo)點(diǎn),并以該點(diǎn)為界,把曲線(xiàn)分為兩部分,對(duì)這兩部分曲線(xiàn)重復(fù)上述過(guò)程,直至所有的點(diǎn)都被處理完成; 最后將首尾端點(diǎn)和保存下來(lái)的點(diǎn)相連,獲得簡(jiǎn)化后的曲線(xiàn)(折現(xiàn))。 沒(méi)人看的代碼實(shí)現(xiàn)Python 已經(jīng)有實(shí)現(xiàn)其算法的第三方庫(kù) rdp,你可以通過(guò)以下命令安裝。 pip install rdp該模塊非常簡(jiǎn)潔,僅專(zhuān)注于實(shí)現(xiàn) rdp 算法,其 Python 源代碼如下: """rdp ~~~ Python implementation of the Ramer-Douglas-Peucker algorithm. :copyright: 2014-2016 Fabian Hirschmann <fabian@hirschmann.email> :license: MIT, see LICENSE.txt for more details. """ from math import sqrt from functools import partial import numpy as np import sys if sys.version_info[0] >= 3: xrange = range def pldist(point, start, end): """ Calculates the distance from ``point`` to the line given by the points ``start`` and ``end``. :param point: a point :type point: numpy array :param start: a point of the line :type start: numpy array :param end: another point of the line :type end: numpy array """ if np.all(np.equal(start, end)): return np.linalg.norm(point - start) return np.divide( np.abs(np.linalg.norm(np.cross(end - start, start - point))), np.linalg.norm(end - start)) def rdp_rec(M, epsilon, dist=pldist): """ Simplifies a given array of points. Recursive version. :param M: an array :type M: numpy array :param epsilon: epsilon in the rdp algorithm :type epsilon: float :param dist: distance function :type dist: function with signature ``f(point, start, end)`` -- see :func:`rdp.pldist` """ dmax = 0.0 index = -1 for i in xrange(1, M.shape[0]): d = dist(M[i], M[0], M[-1]) if d > dmax: index = i dmax = d if dmax > epsilon: r1 = rdp_rec(M[:index + 1], epsilon, dist) r2 = rdp_rec(M[index:], epsilon, dist) return np.vstack((r1[:-1], r2)) else: return np.vstack((M[0], M[-1])) def _rdp_iter(M, start_index, last_index, epsilon, dist=pldist): stk = [] stk.append([start_index, last_index]) global_start_index = start_index indices = np.ones(last_index - start_index + 1, dtype=bool) while stk: start_index, last_index = stk.pop() dmax = 0.0 index = start_index for i in xrange(index + 1, last_index): if indices[i - global_start_index]: d = dist(M[i], M[start_index], M[last_index]) if d > dmax: index = i dmax = d if dmax > epsilon: stk.append([start_index, index]) stk.append([index, last_index]) else: for i in xrange(start_index + 1, last_index): indices[i - global_start_index] = False return indices def rdp_iter(M, epsilon, dist=pldist, return_mask=False): """ Simplifies a given array of points. Iterative version. :param M: an array :type M: numpy array :param epsilon: epsilon in the rdp algorithm :type epsilon: float :param dist: distance function :type dist: function with signature ``f(point, start, end)`` -- see :func:`rdp.pldist` :param return_mask: return the mask of points to keep instead :type return_mask: bool """ mask = _rdp_iter(M, 0, len(M) - 1, epsilon, dist) if return_mask: return mask return M[mask] def rdp(M, epsilon=0, dist=pldist, algo="iter", return_mask=False): """ Simplifies a given array of points using the Ramer-Douglas-Peucker algorithm. Example: # >>> from rdp import rdp # >>> rdp([[1, 1], [2, 2], [3, 3], [4, 4]]) [[1, 1], [4, 4]] This is a convenience wrapper around both :func:`rdp.rdp_iter` and :func:`rdp.rdp_rec` that detects if the input is a numpy array in order to adapt the output accordingly. This means that when it is called using a Python list as argument, a Python list is returned, and in case of an invocation using a numpy array, a NumPy array is returned. The parameter ``return_mask=True`` can be used in conjunction with ``algo="iter"`` to return only the mask of points to keep. Example: # >>> from rdp import rdp # >>> import numpy as np # >>> arr = np.array([1, 1, 2, 2, 3, 3, 4, 4]).reshape(4, 2) # >>> arr array([[1, 1], [2, 2], [3, 3], [4, 4]]) # >>> mask = rdp(arr, algo="iter", return_mask=True) # >>> mask array([ True, False, False, True], dtype=bool) # >>> arr[mask] array([[1, 1], [4, 4]]) :param M: a series of points :type M: numpy array with shape ``(n,d)`` where ``n`` is the number of points and ``d`` their dimension :param epsilon: epsilon in the rdp algorithm :type epsilon: float :param dist: distance function :type dist: function with signature ``f(point, start, end)`` -- see :func:`rdp.pldist` :param algo: either ``iter`` for an iterative algorithm or ``rec`` for a recursive algorithm :type algo: string :param return_mask: return mask instead of simplified array :type return_mask: bool """ if algo == "iter": algo = partial(rdp_iter, return_mask=return_mask) elif algo == "rec": if return_mask: raise NotImplementedError( "return_mask=True not supported with algo=\"rec\"") algo = rdp_rec if "numpy" in str(type(M)): return algo(M, epsilon, dist) return algo(np.array(M), epsilon, dist).tolist() 該源代碼中,計(jì)算每個(gè)點(diǎn)到直線(xiàn)距離封裝成了一個(gè)單獨(dú)的函數(shù): 然后根據(jù)迭代、或者遞歸這兩種編程技巧(方式)分別實(shí)現(xiàn)了 rdp,供用戶(hù)選擇。默認(rèn)選擇是遞歸實(shí)現(xiàn)。 同樣閾值(epsilon)的設(shè)置非常重要,下面有三張?jiān)诓煌撝登闆r下,進(jìn)行曲線(xiàn)(折線(xiàn))簡(jiǎn)化后的可視化對(duì)比:其中藍(lán)色曲線(xiàn)是原始數(shù)據(jù)的曲線(xiàn),紅色是簡(jiǎn)化后的曲線(xiàn)。 當(dāng)閾值為0時(shí),僅會(huì)舍棄一條直線(xiàn)上的多余節(jié)點(diǎn) 隨著閾值從0到1再到4,可以直觀(guān)的看到曲線(xiàn)(藍(lán)色)從平滑慢慢變得菱角分明(紅色),魚(yú)與熊掌不可得兼,既要精簡(jiǎn)的數(shù)據(jù)量,又要盡可能的好看平滑,這就是算法開(kāi)發(fā)人員的“白鯨”,夢(mèng)想魚(yú)與熊掌的最大兼得。 隨便一提,閾值為0可用于刪減同一條直線(xiàn)上的多余的點(diǎn)。 最后新的一天從新的知識(shí)開(kāi)始,今天你又學(xué)廢了嘛? 其實(shí)我還想寫(xiě)一個(gè)結(jié)合 OGR 庫(kù)使用 RDP 算法處理 shp 數(shù)據(jù)的小案例,結(jié)果發(fā)現(xiàn)字可能挺多的,就以后有機(jī)會(huì)再寫(xiě)吧! |
|
來(lái)自: GIS薈 > 《待分類(lèi)》