Overview
This guide shows how to auto-detect popular candlestick patterns with Python and pandas. It focuses on practical, vectorized rules you can run on OHLC data for algorithmic trading.
Patterns covered:
- Doji
- Hammer and Shooting Star
- Bullish/Bearish Engulfing
- Morning Star (3-candle)
Quickstart
- Install dependencies: pandas and numpy
- Load OHLC data (open, high, low, close)
- Compute body and wick metrics
- Apply vectorized boolean rules for each pattern
- Optionally filter by trend (e.g., a moving average)
Example setup:
# pip install pandas numpy
import pandas as pd
import numpy as np
# Example: load from CSV with columns: timestamp,open,high,low,close
# df = pd.read_csv("ohlc.csv", parse_dates=["timestamp"]).set_index("timestamp")
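The commented-out loader above can be exercised without a file on disk. The following is a minimal sketch that reads a small in-memory CSV; the sample rows, the mixed-case headers, and the normalization step are illustrative assumptions, not part of any particular data feed.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file such as "ohlc.csv".
csv_text = """timestamp,Open,High,Low,Close
2024-01-02 09:31:00,100.2,100.8,100.0,100.6
2024-01-02 09:30:00,100.0,101.0,99.5,100.2
2024-01-02 09:32:00,100.6,100.9,100.1,100.3
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])
# Normalize header case so later rules can rely on lowercase column names.
df.columns = [c.strip().lower() for c in df.columns]
df = df.set_index("timestamp").sort_index()
df = df[["open", "high", "low", "close"]].astype(float)
print(df)
```

Replacing io.StringIO(csv_text) with a file path gives the same result for a CSV with that layout.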
Minimal working example
Self-contained script that constructs a small OHLC set and detects the patterns.
import pandas as pd
import numpy as np
# Build a tiny OHLC sample with known patterns
rows = [
    # o, h, l, c                        # comment
    (100.00, 101.00,  99.50, 100.02),   # 0 doji
    (100.10, 100.60,  99.90, 100.40),   # 1 small bull
    (102.00, 102.20, 101.20, 101.50),   # 2 small bear
    (101.40, 103.50, 101.30, 103.20),   # 3 bull engulfing vs 2
    (102.80, 103.20, 101.50, 103.10),   # 4 hammer (body/range > 0.1, so not a doji)
    (103.20, 103.50, 102.70, 103.00),   # 5 neutral
    (104.00, 104.20, 101.80, 102.00),   # 6 long bear
    (101.90, 102.40, 101.60, 102.10),   # 7 small body near lows
    (102.20, 104.70, 102.00, 104.50),   # 8 strong bull -> morning star
    (104.80, 106.20, 104.10, 104.20),   # 9 shooting star (upper wick >= 2 * body)
]
df = pd.DataFrame(rows, columns=["open","high","low","close"]).astype(float)
# Derived metrics
rng = (df["high"] - df["low"]).replace(0, np.finfo(float).eps)
body = (df["close"] - df["open"]).abs()
upper_wick = df["high"] - df[["open","close"]].max(axis=1)
lower_wick = df[["open","close"]].min(axis=1) - df["low"]
# Core booleans
bull = df["close"] > df["open"]
bear = df["close"] < df["open"]
# 1) Doji: tiny body relative to range
DOJI_THR = 0.1
df["doji"] = (body / rng) <= DOJI_THR
# 2) Hammer: long lower wick, small upper wick, close in upper part
df["hammer"] = (
    (lower_wick >= 2 * body) &
    (upper_wick <= body) &
    ((df["close"] - df["low"]) / rng > 0.5)
)
# 3) Shooting star: mirror of hammer
df["shooting_star"] = (
    (upper_wick >= 2 * body) &
    (lower_wick <= body) &
    ((df["high"] - df["close"]) / rng > 0.5)
)
# 4) Engulfing (2-candle)
prev_open = df["open"].shift(1)
prev_close = df["close"].shift(1)
prev_body = (prev_close - prev_open).abs()
# Bullish engulfing: prev bear, curr bull, body engulfs prior body
df["bull_engulf"] = (
    (prev_close < prev_open) & bull &
    (df["open"] <= prev_close) & (df["close"] >= prev_open) &
    (body > prev_body)
)
# Bearish engulfing: prev bull, curr bear, body engulfs prior body
df["bear_engulf"] = (
    (prev_close > prev_open) & bear &
    (df["open"] >= prev_close) & (df["close"] <= prev_open) &
    (body > prev_body)
)
# 5) Morning star (3-candle): bear -> small -> strong bull closing above mid of day -2
prev2_open = df["open"].shift(2)
prev2_close = df["close"].shift(2)
prev1_open = df["open"].shift(1)
prev1_close = df["close"].shift(1)
prev1_body = (prev1_close - prev1_open).abs()
prev1_rng = (df["high"].shift(1) - df["low"].shift(1)).replace(0, np.finfo(float).eps)
small_prev1 = (prev1_body / prev1_rng) < 0.3
mid_prev2 = (prev2_open + prev2_close) / 2
df["morning_star"] = (
    (prev2_close < prev2_open) &   # day -2 is bearish
    small_prev1 &                  # day -1 small-bodied
    bull &                         # current is bullish
    (df["close"] > mid_prev2)      # close above midpoint of day -2
)
patterns = [
    "doji", "hammer", "shooting_star", "bull_engulf", "bear_engulf", "morning_star"
]
print(df[patterns])
print("\nDetections (rows with any pattern):")
print(df.loc[df[patterns].any(axis=1), ["open","high","low","close"] + patterns])
Expected: rows 0 (doji), 3 (bull_engulf), 4 (hammer), 8 (morning_star), and 9 (shooting_star) are flagged True.
Steps to add this to your pipeline
- Load and clean OHLC
  - Ensure columns open, high, low, close are float.
  - Validate: high >= max(open, close), low <= min(open, close).
  - Drop duplicates and sort by timestamp.
- Compute derived columns
  - Range, body, upper/lower wicks.
  - Use replace(0, eps) to prevent division-by-zero.
- Implement pattern rules
  - Prefer vectorized boolean expressions over loops.
  - Parameterize thresholds (e.g., doji body/range <= 0.1).
- Add context filters (optional)
  - Example: require a prior downtrend for bullish reversals.
    ma = df["close"].rolling(20, min_periods=20).mean()
    df["downtrend"] = df["close"] < ma
    signals = df["hammer"] & df["downtrend"]
- Aggregate and act
  - Combine signals into a column, forward-fill only if necessary.
  - Align with entry/exit rules to avoid look-ahead bias.
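The first and last steps above can be sketched as follows. This is one possible approach under stated assumptions: clean_ohlc is a hypothetical helper name, dropping invalid bars (rather than repairing them) is a design choice, and the sample data and signal are made up for illustration.

```python
import pandas as pd

def clean_ohlc(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: sort, de-duplicate, and drop bars failing OHLC sanity checks."""
    cols = ["open", "high", "low", "close"]
    out = raw.copy()
    out[cols] = out[cols].astype(float)
    # Keep the last copy of any repeated timestamp, then order by time.
    out = out[~out.index.duplicated(keep="last")].sort_index()
    out = out.dropna(subset=cols)
    valid = (
        (out["high"] >= out[["open", "close"]].max(axis=1))
        & (out["low"] <= out[["open", "close"]].min(axis=1))
    )
    return out[valid]

idx = pd.to_datetime(
    ["2024-01-03", "2024-01-02", "2024-01-02", "2024-01-04", "2024-01-05"]
)
raw = pd.DataFrame(
    {
        "open":  [101.0, 100.0, 100.0, 102.0, 103.0],
        "high":  [102.0, 101.0, 101.0, 101.5, 104.0],  # 2024-01-04: high below open
        "low":   [100.5,  99.0,  99.0, 101.0, 102.5],
        "close": [101.5, 100.5, 100.6, 101.2, 103.5],
    },
    index=idx,
)
clean = clean_ohlc(raw)

# Made-up signal on the cleaned bars; in practice it comes from the pattern rules.
signal = pd.Series([True, False, False], index=clean.index)
# Act one bar after the signal bar: shift(1) moves each flag to the next row,
# so an entry never depends on a close that is not yet known.
entry = signal.shift(1, fill_value=False).astype(bool)
print(clean)
print(entry)
```

Whether to drop or repair an invalid bar depends on the data source; dropping is simply the easiest behavior to reason about in a sketch.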
Pattern rules used (concise)
Let body = |close - open|, range = high - low.
- Doji: body/range <= 0.1
- Hammer: lower_wick >= 2*body, upper_wick <= body, close in the upper half of the range ((close - low)/range > 0.5)
- Shooting star: upper_wick >= 2*body, lower_wick <= body, close in the lower half of the range ((high - close)/range > 0.5)
- Bullish engulfing: prev bear, curr bull, curr body engulfs prev body
- Bearish engulfing: prev bull, curr bear, curr body engulfs prev body
- Morning star: bear (t-2) -> small body (t-1, body/range < 0.3) -> bull (t) closing above the body midpoint of (t-2)
Tune thresholds to your instrument and timeframe.
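To make tuning concrete, the single-candle rules can be wrapped in a function that takes the thresholds as arguments. This is a minimal sketch: the function name single_candle_patterns and the parameter names doji_thr and wick_mult are hypothetical, and the four sample bars are invented to show how the doji count moves with the threshold.

```python
import numpy as np
import pandas as pd

def single_candle_patterns(df, doji_thr=0.1, wick_mult=2.0):
    """Hypothetical wrapper around the doji, hammer, and shooting star rules."""
    rng = (df["high"] - df["low"]).replace(0, np.finfo(float).eps)
    body = (df["close"] - df["open"]).abs()
    upper = df["high"] - df[["open", "close"]].max(axis=1)
    lower = df[["open", "close"]].min(axis=1) - df["low"]
    return pd.DataFrame(
        {
            "doji": (body / rng) <= doji_thr,
            "hammer": (lower >= wick_mult * body) & (upper <= body)
            & ((df["close"] - df["low"]) / rng > 0.5),
            "shooting_star": (upper >= wick_mult * body) & (lower <= body)
            & ((df["high"] - df["close"]) / rng > 0.5),
        },
        index=df.index,
    )

bars = pd.DataFrame(
    {
        "open":  [100.00, 102.80, 104.80, 100.00],
        "high":  [101.00, 103.20, 106.20, 101.00],
        "low":   [ 99.50, 101.50, 104.10,  99.00],
        "close": [100.02, 103.10, 104.20, 100.25],
    }
)

flags = single_candle_patterns(bars)
# Count doji detections at a few thresholds to see how sensitive the rule is.
thresholds = [0.05, 0.10, 0.20]
doji_counts = [
    int(single_candle_patterns(bars, doji_thr=t)["doji"].sum()) for t in thresholds
]
print(flags)
print(dict(zip(thresholds, doji_counts)))
```

On this sample the doji count stays flat between 0.05 and 0.10 and then jumps at 0.20, which is the kind of step change worth checking before fixing a threshold.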
Pitfalls and validation
- Data quality: Ensure high/low aren’t inverted; handle missing bars and outliers.
- Adjusted data: Corporate actions can distort candles; use adjusted OHLC for equities.
- Gaps: Some patterns assume gaps; in 24/7 markets they’re rare, so adapt the rules.
- Look-ahead bias: Don’t use the same bar’s close for entry; act on next bar.
- Timezone/holidays: Align sessions; avoid mixing RTH with after-hours unless intended.
- Resampling: When building higher timeframes, use proper OHLC aggregation, not mean.
- Parameter sensitivity: Small threshold tweaks can change signals; backtest ranges.
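The resampling pitfall is easy to demonstrate. The sketch below builds 3-minute bars from invented 1-minute bars using first/max/min/last aggregation and contrasts the result with a plain mean; the sample prices and bar sizes are assumptions for illustration.

```python
import pandas as pd

idx = pd.date_range("2024-01-02 09:30", periods=6, freq="1min")
m1 = pd.DataFrame(
    {
        "open":  [100.0, 100.4, 100.2, 100.9, 101.0, 100.7],
        "high":  [100.6, 100.5, 101.0, 101.2, 101.1, 100.9],
        "low":   [ 99.8, 100.1, 100.2, 100.8, 100.5, 100.3],
        "close": [100.4, 100.2, 100.9, 101.0, 100.7, 100.4],
    },
    index=idx,
)

# Proper OHLC aggregation: first open, highest high, lowest low, last close.
ohlc_rules = {"open": "first", "high": "max", "low": "min", "close": "last"}
m3 = m1.resample("3min").agg(ohlc_rules).dropna()

# Averaging each column instead understates highs and overstates lows.
naive = m1.resample("3min").mean()

print(m3)
print(naive)
```

Because wick lengths come straight from high and low, a mean-aggregated bar would shrink every wick and change which patterns fire.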
Performance notes
- Vectorize: Use pandas/numpy boolean arrays; avoid Python loops and row-wise apply.
- Memory: Use float32 where precision allows; drop intermediates after computing signals.
- Chunking: For very large histories, process in chunks and write signals to disk.
- Grouped runs: For multi-symbol datasets, groupby("symbol") and apply vectorized rules per group.
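- Parallelism: Parallelize across symbols with multiprocessing; if you split by time window instead, overlap the windows by the longest pattern lookback (two bars for the rules here) so shifted comparisons still see their prior bars.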
- Alternatives: Libraries like TA-Lib or pandas_ta provide prebuilt pattern functions; benchmark them vs your vectorized rules.
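For the multi-symbol case, the main trap is that a plain shift(1) on a stacked table lets the first bar of one symbol compare against the last bar of another. A minimal sketch, assuming a long-format table with a symbol column that is already time-sorted within each symbol (the symbols and prices are invented, and bull_engulf is a hypothetical helper):

```python
import pandas as pd

data = pd.DataFrame(
    {
        "symbol": ["AAA", "AAA", "AAA", "BBB", "BBB"],
        "open":   [100.9, 100.3, 101.1, 100.5, 101.4],
        "high":   [101.0, 101.5, 101.3, 101.6, 101.8],
        "low":    [100.2, 100.2, 100.5, 100.4, 101.0],
        "close":  [100.4, 101.2, 100.6, 101.4, 101.6],
    }
)

def bull_engulf(df, prev_open, prev_close):
    """Bullish engulfing rule, given the previous bar's open and close."""
    body = (df["close"] - df["open"]).abs()
    prev_body = (prev_close - prev_open).abs()
    return (
        (prev_close < prev_open) & (df["close"] > df["open"])
        & (df["open"] <= prev_close) & (df["close"] >= prev_open)
        & (body > prev_body)
    )

# Group-aware shift: the first bar of each symbol gets NaN as its "previous" bar.
g = data.groupby("symbol", sort=False)
data["bull_engulf"] = bull_engulf(data, g["open"].shift(1), g["close"].shift(1))

# Naive shift for comparison: BBB's first bar is compared with AAA's last bar.
naive = bull_engulf(data, data["open"].shift(1), data["close"].shift(1))

print(data)
print(naive)
```

On this sample the naive version reports a spurious engulfing on the first BBB bar, while the grouped version flags only the genuine one inside AAA.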
Extending to more patterns
- Harami (inside bar body): current body is contained within previous body.
- Piercing/Dark Cloud Cover: two-candle penetrations of prior body.
- Three White Soldiers/Black Crows: sequences of strong candles in the same direction.
Implement by combining shifts, body thresholds, and relative close conditions.
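As an illustration of that recipe, here is a sketch of two of the patterns listed above. The definitions are simplified assumptions: harami is taken as "current body inside the previous body", and Three White Soldiers as three consecutive strong bullish candles with rising closes, where "strong" means body/range >= 0.6 (STRONG_THR is a hypothetical threshold). The sample bars are invented.

```python
import numpy as np
import pandas as pd

bars = pd.DataFrame(
    {
        "open":  [104.0, 102.4, 102.6, 103.4, 104.3],
        "high":  [104.2, 103.0, 103.6, 104.5, 105.4],
        "low":   [101.8, 102.2, 102.5, 103.3, 104.2],
        "close": [102.0, 102.8, 103.5, 104.4, 105.3],
    }
)

STRONG_THR = 0.6  # assumed cutoff for a "strong" candle; tune per market

body_hi = bars[["open", "close"]].max(axis=1)
body_lo = bars[["open", "close"]].min(axis=1)
body = body_hi - body_lo
rng = (bars["high"] - bars["low"]).replace(0, np.finfo(float).eps)

# Harami: current body sits inside the previous body and is smaller than it.
bars["harami"] = (
    (body_hi <= body_hi.shift(1))
    & (body_lo >= body_lo.shift(1))
    & (body < body.shift(1))
)

# Three White Soldiers (simplified): three strong bull candles, each closing higher.
strong_bull = (bars["close"] > bars["open"]) & (body / rng >= STRONG_THR)
higher_close = bars["close"] > bars["close"].shift(1)
bars["three_white_soldiers"] = (
    strong_bull
    & strong_bull.shift(1, fill_value=False)
    & strong_bull.shift(2, fill_value=False)
    & higher_close
    & higher_close.shift(1, fill_value=False)
)

print(bars[["harami", "three_white_soldiers"]])
```

Stricter textbook versions add conditions such as each soldier opening inside the prior body; those can be layered on with further shifted comparisons.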
Tiny FAQ
Q: Do I need TA-Lib to detect patterns? A: No. Vectorized pandas/numpy rules work well. TA-Lib is an alternative, but its pattern definitions may differ from the rules here, so compare outputs before swapping it in.
Q: Which timeframe does this work on? A: Any OHLC timeframe. Calibrate thresholds per market and timeframe.
Q: How do I avoid repainting? A: Compute patterns on the close of bar t and execute on bar t+1.
Q: Should I trade signals directly? A: Validate with backtests, trend/context filters, and risk management.
Q: Why don’t my signals match charting tools? A: Different definitions and thresholds. Align rules and session settings precisely.