      Paginating Web Data with Twitter-Style Cursors – Tim [Backend Technology]

       ShangShujie 2010-05-06

      This article looks at how different ways of implementing data pagination in Web applications differ in performance.

      Taking MySQL as an example, the paging feature shown in the figure above is implemented as

      select * from msgs where thread_id = ? limit page * count, count
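The page/count query above can be sketched end to end as follows. This is a minimal illustration, not the article's actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper names `make_demo_db` and `fetch_page`, the `body` column, and the sample rows are all assumptions. The offset `page * count` is computed in the application and passed as a bound parameter.

```python
import sqlite3


def make_demo_db(rows=25):
    """Build a small in-memory msgs table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE msgs (id INTEGER PRIMARY KEY, thread_id INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO msgs (id, thread_id, body) VALUES (?, ?, ?)",
        [(i, 1, f"message {i}") for i in range(1, rows + 1)],
    )
    return conn


def fetch_page(conn, thread_id, page, count):
    """Return one page of rows using offset paging; page is 0-based."""
    offset = page * count  # computed by the application, not by the database
    return conn.execute(
        "SELECT id, body FROM msgs WHERE thread_id = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (thread_id, count, offset),
    ).fetchall()


conn = make_demo_db()
page_2 = fetch_page(conn, thread_id=1, page=2, count=10)
print([row[0] for row in page_2])  # [21, 22, 23, 24, 25]
```

The `ORDER BY id` is added here so that page boundaries are deterministic; the database still has to walk past the first `offset` rows before it can return the requested page.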

      When reading the Twitter API, however, we find that many endpoints use a cursor instead of the more intuitive page and count parameters, for example the followers ids endpoint

      URL:

      http://twitter.com/followers/ids.format

      Returns an array of numeric IDs for every user following the specified user.

      Parameters:
      * cursor. Required. Breaks the results into pages. Provide a value of -1 to begin paging. Provide values as returned to in the response body’s next_cursor and previous_cursor attributes to page back and forth in the list.
      o Example: http://twitter.com/followers/ids/barackobama.xml?cursor=-1
      o Example: http://twitter.com/followers/ids/barackobama.xml?cursor=-1300794057949944903

      http://twitter.com/followers/ids.format
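From the caller's side, the cursor protocol described above amounts to a simple loop: start with -1, then keep passing back whatever next_cursor the previous response returned. The sketch below does not call Twitter; `fake_followers_ids` is a hypothetical in-memory stand-in, the ids are made up, and treating a next_cursor of 0 as "no more pages" is an assumption of this example rather than something the excerpt above states. Only forward paging is shown.

```python
FOLLOWER_IDS = list(range(1000, 1012))  # made-up follower ids
PAGE_SIZE = 5                           # made-up block size


def fake_followers_ids(cursor):
    """Hypothetical stand-in for the followers/ids call.

    The cursor is just a list position here; a real client must treat it
    as an opaque value and never compute it.
    """
    start = 0 if cursor == -1 else cursor
    end = start + PAGE_SIZE
    return {
        "ids": FOLLOWER_IDS[start:end],
        # Assumption of this sketch: 0 signals that no further page exists.
        "next_cursor": end if end < len(FOLLOWER_IDS) else 0,
    }


def fetch_all_ids(fetch_page):
    """Follow next_cursor from -1 until the server reports the end."""
    ids, cursor = [], -1
    while cursor != 0:
        page = fetch_page(cursor)
        ids.extend(page["ids"])
        cursor = page["next_cursor"]
    return ids


print(len(fetch_all_ids(fake_followers_ids)))  # 12
```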

      As the description above shows, the http://twitter.com/followers/ids.xml call takes a cursor parameter for paging rather than the traditional url?page=n&count=n form. What is the advantage? Does each cursor hold a snapshot of the data set as it was at that moment, so that real-time changes to the result set cannot produce duplicate entries in the query results?
      In the Google Groups thread Cursor Expiration, Twitter architect John Kalucki said

      A cursor is an opaque deletion-tolerant index into a Btree keyed by source
      userid and modification time. It brings you to a point in time in the
      reverse chron sorted list. So, since you can’t change the past, other than
      erasing it, it’s effectively stable. (Modifications bubble to the top.) But
      you have to deal with additions at the list head and also block shrinkage
      due to deletions, so your blocks begin to overlap quite a bit as the data
      ages. (If you cache cursors and read much later, you’ll see the first few
      rows of cursor[n+1]’s block as duplicates of the last rows of cursor[n]’s
      block. The intersection cardinality is equal to the number of deletions in
      cursor[n]’s block). Still, there may be value in caching these cursors and
      then heuristically rebalancing them when the overlap proportion crosses some
      threshold.
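The stability property in the quote can be illustrated with a small experiment. This is a sketch under stated assumptions, not Twitter's Btree index: it uses sqlite3, a plain integer primary key as the cursor, and a newest-first listing. The helper names are hypothetical. After the first page is read, two rows are added at the head of the list and the row the cursor points at is deleted.

```python
import sqlite3


def make_db(n):
    """In-memory msgs table with ids 1..n (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE msgs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO msgs (id, body) VALUES (?, ?)",
        [(i, f"m{i}") for i in range(1, n + 1)],
    )
    return conn


def page_by_offset(conn, page, count):
    """Newest-first page addressed by position in the list."""
    rows = conn.execute(
        "SELECT id FROM msgs ORDER BY id DESC LIMIT ? OFFSET ?",
        (count, page * count),
    )
    return [r[0] for r in rows]


def page_by_cursor(conn, cursor, count):
    """Newest-first page of rows strictly older than the cursor id."""
    rows = conn.execute(
        "SELECT id FROM msgs WHERE id < ? ORDER BY id DESC LIMIT ?",
        (cursor, count),
    )
    return [r[0] for r in rows]


conn = make_db(10)
first_page = page_by_offset(conn, 0, 3)  # [10, 9, 8]
cursor = first_page[-1]                  # 8, the oldest id seen so far

# The list changes between requests: additions at the head, one deletion.
conn.executemany(
    "INSERT INTO msgs (id, body) VALUES (?, ?)", [(11, "new"), (12, "new")]
)
conn.execute("DELETE FROM msgs WHERE id = ?", (cursor,))

print(page_by_offset(conn, 1, 3))       # [9, 7, 6]  (9 was already on page 0)
print(page_by_cursor(conn, cursor, 3))  # [7, 6, 5]  (no repeats)
```

The cursor keeps working even though the row it names no longer exists, because it is only used as a boundary value. The block overlap that the quote describes comes from caching cursors for fixed blocks, which this sketch does not model.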

      In another thread, new cursor-based pagination not multithread-friendly, John added

      The page based approach does not scale with large sets. We can no
      longer support this kind of API without throwing a painful number of
      503s.

      Working with row-counts forces the data store to recount rows in an
      O(n^2) manner. Cursors avoid this issue by allowing practically
      constant time access to the next block. The cost becomes
      O(n/block_size) which, yes, is O(n), but a graceful one given n < 10^7
      and a block_size of 5000. The cursor approach provides a more complete
      and consistent result set.

      Proportionally, very few users require multiple page fetches with a
      page size of 5,000.

      Also, scraping the social graph repeatedly at high speed is could
      often be considered a low-value, borderline abusive use of the social
      graph API.
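The O(n^2) versus O(n) claim can be checked with back-of-the-envelope arithmetic. The cost model below is an assumption of this sketch: reading a page at a given offset means stepping over every skipped row before reading the block, while a cursor read touches only the rows of its own block. The function names are hypothetical, and n = 10^7 is taken as the upper bound mentioned in the quote.

```python
def rows_touched_with_offsets(n, block_size):
    """Total rows stepped over or read when fetching all n rows page by page."""
    total = 0
    for offset in range(0, n, block_size):
        # Skip `offset` rows, then read up to `block_size` more.
        total += min(offset + block_size, n)
    return total


def rows_touched_with_cursors(n, block_size):
    """Total rows read when each block starts directly at its cursor."""
    total = 0
    for start in range(0, n, block_size):
        total += min(block_size, n - start)
    return total


n, block_size = 10**7, 5000
print(rows_touched_with_offsets(n, block_size))  # 10005000000
print(rows_touched_with_cursors(n, block_size))  # 10000000
```

Under this model, a full traversal with offsets touches about n^2 / (2 * block_size) rows, roughly a thousand times more than the cursor traversal at these sizes.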

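      These two passages make the point clear: for large result sets, the main purpose of the cursor approach is a large gain in performance. Taking MySQL as an example again, paging to row 100,000 without a cursor uses the following SQL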

      select * from msgs limit 100000, 100

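      On a table with one million records, the first execution of this statement takes more than 5 seconds.
      If we use the table's primary key value as cursor_id, the SQL for cursor-based paging can be optimized to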

      select * from msgs where id > cursor_id order by id limit 100;

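      On the same table this usually takes less than 100 ms, a speedup of several tens of times. For more on how MySQL LIMIT performance varies, see a rather immature article I wrote three years ago, MySQL LIMIT Performance Issues.

      Conclusion

      For paging through large data sets in Web applications, the cursor approach is recommended. Its drawback is that paging must be sequential: you cannot jump to an arbitrary page.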
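A complete paging loop built on the cursor query might look like the sketch below. It again uses sqlite3 in place of MySQL; `fetch_after` and `iterate_pages` are hypothetical helper names, and the sample ids deliberately have gaps to show that a cursor is a key value, not a page number. Starting from a cursor of 0 assumes that all ids are positive.

```python
import sqlite3


def fetch_after(conn, cursor_id, count):
    """Return up to `count` rows with id > cursor_id, plus the next cursor."""
    rows = conn.execute(
        "SELECT id, body FROM msgs WHERE id > ? ORDER BY id LIMIT ?",
        (cursor_id, count),
    ).fetchall()
    # The id of the last row returned becomes the cursor for the next call.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


def iterate_pages(conn, count):
    """Yield successive pages until the table is exhausted."""
    cursor_id = 0  # assumption: ids are positive integers
    while True:
        rows, next_cursor = fetch_after(conn, cursor_id, count)
        if not rows:
            break
        yield rows
        cursor_id = next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msgs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO msgs (id, body) VALUES (?, ?)",
    [(i, f"message {i}") for i in range(3, 31, 3)],  # ids 3, 6, ..., 30
)

for page in iterate_pages(conn, count=4):
    print([row[0] for row in page])
# [3, 6, 9, 12]
# [15, 18, 21, 24]
# [27, 30]
```

Each call seeks straight to the cursor position through the primary key index, so the cost of a page does not grow with how deep into the table it is.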

      4 Comments  »

      1. pi1ot says:

        In practice the trouble usually comes from status = pass or other filter conditions sitting between WHERE and LIMIT; once the data is no longer contiguous, the cursor does not work so well.

      2. fff says:

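        The previous commenter probably means ORDER BY, that is, when the rows are not ordered by id.
        Jumping to a page can be handled with one extra computation that fetches the right cursor, which is probably still relatively simple.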

      3. 超群.com says:

        Have a look at one of my blog posts: http://www./2009/04/efficient- pagination-using-mysql/

        It uses a cursor and still lets you jump to any page.

      4. gen says:

        What should be done when the rows are not sorted by the primary key id?
